I’m looking to find how many grouped gaps exist for a given time range.
starting range: 2012-01-12 00:00:00
ending range: 2012-01-18 23:59:59
Which translates roughly to:
type 10 11 12 13 14 15 16 17 18 19 20
a       |--========]
a                            |==------]
b                |==============--]
c    |-----===========]
d       |--=====================------]
The same data grouped by type:
a       |--========]         |==------]
b                |==============--]
c    |-----===========]
d       |--=====================------]
Resulting in:
type  gap
---------
a     1 (yes)
b     1 (yes)
c     1 (yes)
d     0 (no)
And eventually…
SUM(gap) AS gaps
----------------
3
UPDATE for clarification:
Data is stored with start and end timestamps per type. For example:
id  type  start_datetime       end_datetime
--------------------------------------------------
1   a     2012-01-11 00:00:00  2012-01-14 23:59:59
2   a     2012-01-18 00:00:00  2012-01-20 23:59:59
3   b     2012-01-14 00:00:00  2012-01-19 23:59:59
4   c     2012-01-10 00:00:00  2012-01-15 23:59:59
5   d     2012-01-11 00:00:00  2012-01-20 23:59:59
Here’s a variant on wildplasser’s answer that uses window functions instead of a CTE, based on the same test fixture.
The approach uses sum() as a window aggregate, adding 1 for each range start and subtracting 1 for each range end, and then looks for points where the running sum drops to 0 within the target range. As wildplasser did, I had to add a couple of extra entries at the endpoints of the boundary that contribute nothing to the sum, so that groups where nothing covers the boundary are still found…
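To make the running-sum idea concrete outside of SQL, here is a rough Python sketch of the same logic. It is not the query from this answer, just an illustration under a few assumptions of mine: the names `find_gaps`, `day_start` and `day_end` are made up, end-of-day is taken to be 23:59:59, end timestamps are treated as inclusive to the second (so a range ending at 23:59:59 and one starting at 00:00:00 the next day count as touching), and only a single zero-weight marker at the lower boundary is needed in this formulation.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import groupby

ONE_SECOND = timedelta(seconds=1)


def day_start(d):
    # Hypothetical helper: midnight on 2012-01-<d>.
    return datetime(2012, 1, d)


def day_end(d):
    # Assumption: "end of day" means 23:59:59.
    return datetime(2012, 1, d, 23, 59, 59)


def find_gaps(rows, range_start, range_end):
    """Return {type: 1 or 0}; 1 means some part of the range is uncovered."""
    # Work with half-open intervals [start, end + 1s) so inclusive ends line up.
    lo, hi = range_start, range_end + ONE_SECOND
    events = defaultdict(list)
    for ztype, start, end in rows:
        events[ztype].append((start, +1))             # range opens
        events[ztype].append((end + ONE_SECOND, -1))  # range closes
    result = {}
    for ztype, evs in events.items():
        # Zero-weight marker so the running sum is inspected at the boundary.
        evs.append((lo, 0))
        evs.sort(key=lambda e: e[0])
        running, gap = 0, 0
        # Events sharing a timestamp are applied together, like window peers.
        for ts, peers in groupby(evs, key=lambda e: e[0]):
            running += sum(delta for _, delta in peers)
            if lo <= ts < hi and running == 0:
                gap = 1
                break
        result[ztype] = gap
    return result


rows = [
    ("a", day_start(11), day_end(14)),
    ("a", day_start(18), day_end(20)),
    ("b", day_start(14), day_end(19)),
    ("c", day_start(10), day_end(15)),
    ("d", day_start(11), day_end(20)),
]
gaps = find_gaps(rows, day_start(12), day_end(18))
print(gaps)                # {'a': 1, 'b': 1, 'c': 1, 'd': 0}
print(sum(gaps.values()))  # 3
```

Like the SQL, this only reports types that have at least one row; a type with no rows at all would need to come from a separate list of types.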
This seems to cost less on the test data, but I suspect that depends heavily on the tables being small. With some rearranging (which would make it even harder to read), it can get by with just two full scans of tmp.gaps, one of which only fetches the distinct ztypes.