I’ve got a PostgreSQL 8.3 database of hotels. Each hotel has a latitude and longitude stored as a point, and belongs to a resort, stored as a resort id. I’d like to find the central, or average, point of each resort.
I can do this using a simple query:
select
    avg(lat_long[0]) as latitude,
    avg(lat_long[1]) as longitude,
    resort_id
from accomm
group by resort_id;
However, there is some bad data in the database. For example, there might be an American hotel recorded in a European resort. Taking a simple average of this data will obviously give inaccurate results.
How can I calculate an interquartile mean, or use a similar method, to filter out this bad data? I’ve currently got about 30,000 rows in my table.
Are all of your hotels in the United States? It seems to me that it might be easier to create a bounding box and just disregard any lat/long combos that are outside of this range.
The biggest drawback of this is that it’s not super precise. Basically you can exclude locations in Europe but something on the US/Canada border will probably not get excluded…
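As a rough sketch of that filter, assuming the accomm table and lat_long column from the question, with lat_long[0] as latitude and lat_long[1] as longitude. The numeric bounds are illustrative guesses for the contiguous United States, not vetted values:

```sql
-- Sketch only: the bounds are rough, assumed values for the contiguous
-- United States; adjust them to the region the hotels should be in.
select
    avg(lat_long[0]) as latitude,
    avg(lat_long[1]) as longitude,
    resort_id
from accomm
where lat_long[0] between 24.0 and 50.0      -- latitude range (assumed)
  and lat_long[1] between -125.0 and -66.0   -- longitude range (assumed)
group by resort_id;
```

Rows outside the box are dropped before averaging, so a resort whose hotels all fall outside it will not appear in the result at all.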