I have a PostgreSQL database table with the following simplified structure:
- Device Id (varchar)
- Pos_X (int)
- Pos_Y (int)
Basically this table contains a lot of two-dimensional waypoint data for devices. Now I want to design a query which reduces the number of coordinates in the output. It should aggregate nearby coordinates (within a certain x,y threshold).
An example:
row 1: DEVICE1;603;1205
row 2: DEVICE1;604;1204
If the threshold is 5, these two rows should be aggregated since their coordinates differ by less than 5.
Any idea how to do this in PostgreSQL or SQL in general?
Use the often overlooked built-in function width_bucket() in combination with your aggregation.
If your coordinates run from, say, 0 to 2000 and you want to consolidate everything within squares of 5 to single points, I would lay out a grid of 10 (5*2).
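A minimal sketch of such a grid query, assuming a table named tbl with columns device_id, pos_x and pos_y (these names and the inline sample rows are placeholders taken from the question's example, not a real schema):

```sql
-- Hypothetical sample data; tbl and the column names are assumptions.
WITH tbl(device_id, pos_x, pos_y) AS (
   VALUES ('DEVICE1', 603, 1205)
        , ('DEVICE1', 604, 1204)
   )
SELECT device_id
     , width_bucket(pos_x, 0, 2000, 2000/10) * 10 AS pos_x  -- 200 buckets, each 10 wide
     , width_bucket(pos_y, 0, 2000, 2000/10) * 10 AS pos_y
     , count(*) AS ct                                       -- or any other aggregate
FROM   tbl
GROUP  BY 1, 2, 3
ORDER  BY 1, 2, 3;
```

With the two sample rows this collapses to a single row (DEVICE1, 610, 1210, ct = 2), since width_bucket() numbers buckets from 1 and the product therefore marks the upper edge of each cell. Note that two points sitting just either side of a cell border still end up in different groups.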
To minimize the error you could GROUP BY the grid as demonstrated, but save actual average coordinates.
sqlfiddle demonstrating both alongside.
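Grouping by the grid while reporting averages might look roughly like this. It is a sketch under the same assumptions as before: tbl, its column names, and the sample rows are placeholders, and the output aliases avg_x and avg_y are made up for illustration.

```sql
-- Hypothetical sample data; tbl and the column names are assumptions.
WITH tbl(device_id, pos_x, pos_y) AS (
   VALUES ('DEVICE1', 603, 1205)
        , ('DEVICE1', 604, 1204)
   )
SELECT device_id
     , avg(pos_x)::int AS avg_x   -- actual average inside the cell; cast only if needed
     , avg(pos_y)::int AS avg_y
     , count(*)        AS ct      -- or any other aggregate
FROM   tbl
GROUP  BY device_id
     , width_bucket(pos_x, 0, 2000, 2000/10)  -- aggregate by grid cell
     , width_bucket(pos_y, 0, 2000, 2000/10)
ORDER  BY 1, 2, 3;
```

The grid expressions appear only in GROUP BY, so the output carries the averaged positions rather than the cell coordinates.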
Well, this particular case could be simpler, using plain integer division instead of width_bucket():
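A sketch of the simpler variant, assuming integer columns and a grid size of 10 (tbl, the column names and the sample rows are again placeholders). Unlike the width_bucket() version, this one labels each cell by its lower edge, so the sample rows come out as 600 and 1200:

```sql
-- Hypothetical sample data; tbl and the column names are assumptions.
WITH tbl(device_id, pos_x, pos_y) AS (
   VALUES ('DEVICE1', 603, 1205)
        , ('DEVICE1', 604, 1204)
   )
SELECT device_id
     , (pos_x / 10) * 10 AS pos_x  -- integer division truncates the last digit
     , (pos_y / 10) * 10 AS pos_y
     , count(*) AS ct
FROM   tbl
GROUP  BY 1, 2, 3
ORDER  BY 1, 2, 3;
```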
But that’s just because the demo grid size of 10 conveniently matches the decimal system. Try the same with a grid size of 17 or something …
Expand to timestamps
You can expand this approach to cover date and timestamp values by converting them to unix epoch (number of seconds since ‘1970-1-1’) with extract().
When you are done, convert the result back to timestamp with time zone, or simply use to_timestamp().
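The round trip described above can be sketched as follows. The timestamp literal is an arbitrary sample value chosen for illustration; its epoch is 1349118398.

```sql
-- Timestamp to Unix epoch (seconds since 1970-01-01 00:00 UTC).
SELECT extract(epoch FROM timestamptz '2012-10-01 21:06:38+02') AS epoch;

-- Epoch back to timestamptz by adding the seconds to the epoch timestamp.
SELECT timestamptz 'epoch' + 1349118398 * interval '1 second' AS ts;

-- The same conversion with the built-in function.
SELECT to_timestamp(1349118398) AS ts;
```

Between the two conversions, the epoch number can be fed to width_bucket() in the same way as the coordinates.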