I have a table in MS SQL Server similar to the one below.
id | Timestamp | active
-----+-----------+--------
1 | 1:00 | 1
1 | 2:00 | 1
1 | 3:00 | 1
1 | 4:00 | 0
1 | 5:00 | 0
1 | 6:00 | 1
1 | 7:00 | 0
1 | 8:00 | 0
1 | 9:00 | 0
1 | 10:00 | 1
1 | 11:00 | 1
1 | 12:00 | 0
1 | 13:00 | 1
2 | 2:00 | 1
2 | 3:00 | 1
2 | 4:00 | 0
2 | 5:00 | 0
3 | 8:00 | 0
3 | 9:00 | 0
4 | 1:00 | 1
4 | 2:00 | 1
5 | 16:00 | 0
What I want to find out is when each ID was inactive (active = 0) and for how long. I tried grouping by id where active = 0 and doing a datediff on the min and max time. But that gives a result for id 1 saying it was offline for 8 hours (12:00 – 4:00) at 12:00. What I really want is a query that gives me the following result set.
id | approx. offline in hours | at time
---+--------------------------+-----------
1 | 1 | 5:00
1 | 2 | 9:00
1 | 0 | 12:00
2 | 1 | 5:00
3 | 1 | 9:00
5 | 0 | 16:00
The wrong query I initially tried is:
SELECT id as [Inactive],
DATEDIFF(hour, MIN(Timestamp), MAX(Timestamp)) as [approx. offline in hours],
MAX(Timestamp) as [at time]
FROM table
WHERE active = 0
GROUP BY id
The problem with that query is that it ignores the active periods in between the inactive ones. I’ve been looking at THIS question, which was asked and answered using PARTITION, but that question is different enough, and its answer specific enough to it, that I can’t make sense of it.
Any help is appreciated.
One way to approach this, which works in any database, is to use a correlated subquery. The idea is to assign a groupname to each run of consecutive rows that share the same active value. The groupname for a row is the time of the next change in value for that id.
One caveat: how you turn timestamps into durations depends on the database, so I’ll let you add that logic.
Also, if the last record for a given id is inactive, the groupname is NULL. That is not a problem, because GROUP BY places all NULL values for an id in a single group.
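The approach described above might look roughly like the following. This is only a sketch under a few assumptions: it runs the query against an in-memory SQLite database from Python so that it can be tried anywhere, the table name status and column name ts are hypothetical, and the timestamps are stored as whole hours so that a plain subtraction gives the duration.

```python
import sqlite3

# Sample rows from the question as (id, ts, active).
# Assumption: ts is stored as a whole hour to keep the arithmetic simple.
ROWS = [
    (1, 1, 1), (1, 2, 1), (1, 3, 1), (1, 4, 0), (1, 5, 0), (1, 6, 1),
    (1, 7, 0), (1, 8, 0), (1, 9, 0), (1, 10, 1), (1, 11, 1), (1, 12, 0),
    (1, 13, 1),
    (2, 2, 1), (2, 3, 1), (2, 4, 0), (2, 5, 0),
    (3, 8, 0), (3, 9, 0),
    (4, 1, 1), (4, 2, 1),
    (5, 16, 0),
]

# The correlated subquery finds, for each row, the time of the next row
# for the same id whose active value differs. Rows in the same run share
# that time, so it serves as the groupname (NULL for a trailing run).
QUERY = """
SELECT id,
       MAX(ts) - MIN(ts) AS offline_hours,
       MAX(ts)           AS at_time
FROM (
    SELECT t.id, t.ts, t.active,
           (SELECT MIN(t2.ts)
              FROM status t2
             WHERE t2.id = t.id
               AND t2.ts > t.ts
               AND t2.active <> t.active) AS groupname
    FROM status t
) AS g
WHERE active = 0
GROUP BY id, groupname
ORDER BY id, at_time
"""


def inactive_runs(rows):
    """Load rows into a throwaway table and return one row per inactive run."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE status (id INTEGER, ts INTEGER, active INTEGER)")
    conn.executemany("INSERT INTO status VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in inactive_runs(ROWS):
        print(row)
```

For id 1 this produces three runs ending at 5, 9 and 12 with durations of 1, 2 and 0 hours. With real timestamp values in SQL Server, the subtraction would presumably be replaced by the DATEDIFF(hour, MIN(Timestamp), MAX(Timestamp)) expression from the question.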