I’m building an application that pulls lat/long values from a database and plots them on a Google Map. There could be thousands of data points, so I “cluster” points that are close to each other so the user is not overwhelmed with icons. At the moment I perform this clustering in the application, with a simple algorithm like this:
- Get array of all points
- Pop first point off array
- Compare that point to all other points in the array, looking for ones that fall within distance x
- Create a cluster from the original point and the close points
- Remove close points from array
- Repeat
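
The steps above can be sketched roughly as follows. This is only an illustration of the approach described, not the actual application code: the function names, the choice of Python, and the use of the haversine formula for "distance" are all assumptions.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))  # 6371 km: mean Earth radius

def greedy_cluster(points, max_km):
    """Hypothetical helper: take a seed point, absorb everything within max_km, repeat."""
    remaining = list(points)           # copy so the caller's list is untouched
    clusters = []
    while remaining:
        seed = remaining.pop(0)        # pop first point off the array
        near, far = [], []
        for p in remaining:            # compare the seed to every other point
            (near if haversine_km(seed, p) <= max_km else far).append(p)
        clusters.append([seed] + near) # cluster = seed plus its close points
        remaining = far                # close points are removed; repeat on the rest
    return clusters
```

Each pass scans every remaining point, so in the worst case (no two points close enough to merge) the number of distance calculations grows with the square of the number of points.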
Now I realize this is inefficient, which is why I have been looking into GIS systems. I have set up PostGIS and have my lat/long values stored in a POINT geometry.
Can someone get me started or point me to some resources on a simple implementation of this clustering algorithm in PostGIS?
I ended up using a combination of ST_SnapToGrid and avg. I realize there are algorithms out there (e.g. kmeans, as Denis suggested) that will give me better clusters, but for what I’m doing this is fast and accurate enough.
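
The exact query was not posted, so the following is only a guess at what a snap-to-grid plus average approach might look like. The table name `points`, the column name `geom`, and the 0.01 grid size are hypothetical. The Python function is an approximate in-memory analogue of the same idea, included so the logic can be checked without a database.

```python
from collections import defaultdict

# Hypothetical query: `points` and `geom` are assumed names, and 0.01 is an
# arbitrary grid size in the geometry's own units (degrees for lon/lat data).
SNAP_AVG_SQL = """
SELECT count(*)        AS n,
       avg(ST_X(geom)) AS lon,
       avg(ST_Y(geom)) AS lat
FROM points
GROUP BY ST_SnapToGrid(geom, 0.01);
"""

def snap_and_average(points, cell):
    """Bucket (lon, lat) pairs by nearest grid node, then average each bucket.

    Returns a list of (count, mean_lon, mean_lat) tuples, one per occupied cell.
    """
    buckets = defaultdict(list)
    for lon, lat in points:
        # Snapping to a grid amounts to rounding each coordinate to a multiple of `cell`.
        key = (round(lon / cell), round(lat / cell))
        buckets[key].append((lon, lat))
    return [
        (len(b), sum(p[0] for p in b) / len(b), sum(p[1] for p in b) / len(b))
        for b in buckets.values()
    ]
```

Every point is visited once, which is where the speed comes from. The trade-off is that two nearby points on opposite sides of a cell boundary end up in different clusters.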