Assuming a table such as:
UID  Name    Datetime             Users
4    Room 4  2012-08-03 14:00:00  3
2    Room 2  2012-08-03 14:00:00  3
3    Room 3  2012-08-03 14:00:00  1
1    Room 1  2012-08-03 14:00:00  2
3    Room 3  2012-08-03 14:15:00  1
2    Room 2  2012-08-03 14:15:00  4
1    Room 1  2012-08-03 14:15:00  3
1    Room 1  2012-08-03 14:30:00  6
1    Room 1  2012-08-03 14:45:00  3
2    Room 2  2012-08-03 14:45:00  7
3    Room 3  2012-08-03 14:45:00  8
4    Room 4  2012-08-03 14:45:00  4
I want to get the average user count of each room (1, 2, 3, 4) between 2PM and 3PM. The problem is that a room sometimes does not “check in” at a 15-minute interval, so the assumption has to be made that its last known user count is still valid.
For example, room 4 never checked in at 2012-08-03 14:15:00, so it must be assumed that room 4 still had 3 users at that time, because that is what it had at 2012-08-03 14:00:00.
This follows on through so that the average user count I am looking for is as follows:
Room 1: (2 + 3 + 6 + 3) / 4 = 3.5
Room 2: (3 + 4 + #4 + 7) / 4 = 4.5
Room 3: (1 + 1 + #1 + 8) / 4 = 2.75
Room 4: (3 + #3 + #3 + 4) / 4 = 3.25
where # marks a number assumed from the previous known check-in.
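To pin down the expected numbers, the carry-forward rule can be written out directly. This is only a minimal Python sketch of the arithmetic above, not the SQL-only approach being asked for; the names `CHECKINS`, `INTERVALS` and `average_users` are made up for illustration, and it assumes every room checks in at the first interval (14:00), as in the sample data.

```python
# (UID, time, Users) check-ins copied from the table above
CHECKINS = [(4, "14:00", 3), (2, "14:00", 3), (3, "14:00", 1), (1, "14:00", 2),
            (3, "14:15", 1), (2, "14:15", 4), (1, "14:15", 3), (1, "14:30", 6),
            (1, "14:45", 3), (2, "14:45", 7), (3, "14:45", 8), (4, "14:45", 4)]
INTERVALS = ["14:00", "14:15", "14:30", "14:45"]


def average_users(checkins, intervals):
    """Average users per room, carrying the last known count into missed intervals."""
    by_room = {}
    for uid, slot, users in checkins:
        by_room.setdefault(uid, {})[slot] = users

    averages = {}
    for uid, seen in sorted(by_room.items()):
        last_known = None  # assumption: the first interval is always present
        total = 0
        for slot in intervals:
            # Use this interval's check-in, or fall back to the previous one.
            last_known = seen.get(slot, last_known)
            total += last_known
        averages[uid] = total / len(intervals)
    return averages


averages = average_users(CHECKINS, INTERVALS)
print(averages)  # {1: 3.5, 2: 4.5, 3: 2.75, 4: 3.25}
```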
I am wondering if it’s possible to do this with SQL alone. If not, I am curious about an ingenious PHP solution that isn’t just brute-force math, such as my quick, inaccurate pseudocode:
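// $rooms_id_array and query() are placeholders, as in the original pseudocode
foreach ($rooms_id_array as $room_id) {
    $room_id = (int) $room_id; // keep the interpolated value numeric
    $SQL = "SELECT `Users` FROM `table`
            WHERE `UID` = $room_id
              AND `Datetime` >= '2012-08-03 14:00:00'
              AND `Datetime` <= '2012-08-03 15:00:00'";
    $result = query($SQL); // assumed to return an array of rows
    if (count($result) < 4) {
        // go through each interval, find which ones are missing,
        // and use the previous known check-in for those instead
    } else {
        $sum = 0;
        foreach ($result as $row) {
            $sum += $row['Users'];
        }
        $avg = $sum / 4;
    }
}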
You can use this solution:
Query Breakdown:
Step 1:
The first thing we need to do is associate each room with each time interval. For example, in your example data, Room 4 does not have an association with the intervals 14:15:00 and 14:30:00, but we still need to represent those associations somehow.
We accomplish this by creating a Cartesian product of each distinct room with the relevant time intervals.
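A minimal sketch of such a Cartesian product follows. It is an illustration under assumptions rather than a definitive query: it runs against SQLite through Python's `sqlite3` so that it is self-contained, the four intervals are hard-coded, and the aliases `a` and `b` are arbitrary. Only `tbl`, `UID` and `intrvl` come from the text.

```python
import sqlite3

# (UID, time, Users) check-ins copied from the question's table
CHECKINS = [(4, "14:00", 3), (2, "14:00", 3), (3, "14:00", 1), (1, "14:00", 2),
            (3, "14:15", 1), (2, "14:15", 4), (1, "14:15", 3), (1, "14:30", 6),
            (1, "14:45", 3), (2, "14:45", 7), (3, "14:45", 8), (4, "14:45", 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (UID INTEGER, Name TEXT, Datetime TEXT, Users INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(uid, f"Room {uid}", f"2012-08-03 {t}:00", n) for uid, t, n in CHECKINS],
)

# Every distinct room paired with every interval, whether or not it checked in.
pairs = conn.execute("""
    SELECT a.UID, b.intrvl
    FROM (SELECT DISTINCT UID FROM tbl) a
    CROSS JOIN (
        SELECT '14:00:00' AS intrvl UNION ALL
        SELECT '14:15:00' UNION ALL
        SELECT '14:30:00' UNION ALL
        SELECT '14:45:00'
    ) b
    ORDER BY a.UID, b.intrvl
""").fetchall()

print(len(pairs))  # 16: 4 rooms x 4 intervals
```

Room 4 now appears next to 14:15:00 and 14:30:00 even though it has no rows for those times.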
Step 2:
Once we have those associations, we join the result back onto the main table (tbl) on the condition that the time part of the main table’s Datetime field is less than or equal to the Cartesian-joined time for each UID. For each UID -> intrvl association, this shows all entries that occurred on or before the intrvl time.
So for example, since Room 3 doesn’t have an entry for the 14:30:00 intrvl, only two entries will join with that intrvl: the ones at 14:15:00 and 14:00:00, since they both occurred on or before the intrvl time.
You can now see where we are going with this. The result of this step gives us access to the most recent entry for each intrvl.
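The join described here might look roughly like the sketch below. The same assumptions apply: SQLite via Python's `sqlite3` stands in for the real database, the intervals are hard-coded, and the aliases `a`, `b` and `c` are arbitrary. The `TIME()` call extracts the time part of `Datetime` so it can be compared with `intrvl`.

```python
import sqlite3

# (UID, time, Users) check-ins copied from the question's table
CHECKINS = [(4, "14:00", 3), (2, "14:00", 3), (3, "14:00", 1), (1, "14:00", 2),
            (3, "14:15", 1), (2, "14:15", 4), (1, "14:15", 3), (1, "14:30", 6),
            (1, "14:45", 3), (2, "14:45", 7), (3, "14:45", 8), (4, "14:45", 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (UID INTEGER, Name TEXT, Datetime TEXT, Users INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(uid, f"Room {uid}", f"2012-08-03 {t}:00", n) for uid, t, n in CHECKINS],
)

# For each room/interval pair, list every check-in made on or before that interval.
rows = conn.execute("""
    SELECT a.UID, b.intrvl, c.Datetime, c.Users
    FROM (SELECT DISTINCT UID FROM tbl) a
    CROSS JOIN (
        SELECT '14:00:00' AS intrvl UNION ALL
        SELECT '14:15:00' UNION ALL
        SELECT '14:30:00' UNION ALL
        SELECT '14:45:00'
    ) b
    JOIN tbl c
      ON c.UID = a.UID
     AND TIME(c.Datetime) <= TIME(b.intrvl)
    ORDER BY a.UID, b.intrvl, c.Datetime
""").fetchall()

# Room 3 has no 14:30:00 check-in, so only its two earlier entries join.
room3_1430 = [r for r in rows if r[0] == 3 and r[1] == "14:30:00"]
print(room3_1430)
```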
Step 3:
Our next step is to take the result set above and pull only the most recent joined Datetime for each intrvl. We can accomplish this by using GROUP BY in conjunction with the MAX() aggregate function.
Unfortunately, we can’t also correctly pull the value of Users along with each of the selected Datetimes due to how GROUP BY behaves: a non-aggregated column is generally not guaranteed to come from the row that holds the MAX() value.
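As a rough sketch of this grouping step, under the same assumptions as before (SQLite through Python's `sqlite3`, hard-coded intervals, arbitrary aliases), the query below keeps one row per room and interval. The column name `last_date` is the one the next step refers to.

```python
import sqlite3

# (UID, time, Users) check-ins copied from the question's table
CHECKINS = [(4, "14:00", 3), (2, "14:00", 3), (3, "14:00", 1), (1, "14:00", 2),
            (3, "14:15", 1), (2, "14:15", 4), (1, "14:15", 3), (1, "14:30", 6),
            (1, "14:45", 3), (2, "14:45", 7), (3, "14:45", 8), (4, "14:45", 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (UID INTEGER, Name TEXT, Datetime TEXT, Users INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(uid, f"Room {uid}", f"2012-08-03 {t}:00", n) for uid, t, n in CHECKINS],
)

# One row per room/interval: the latest check-in made on or before that interval.
latest = conn.execute("""
    SELECT a.UID, b.intrvl, MAX(c.Datetime) AS last_date
    FROM (SELECT DISTINCT UID FROM tbl) a
    CROSS JOIN (
        SELECT '14:00:00' AS intrvl UNION ALL
        SELECT '14:15:00' UNION ALL
        SELECT '14:30:00' UNION ALL
        SELECT '14:45:00'
    ) b
    JOIN tbl c
      ON c.UID = a.UID
     AND TIME(c.Datetime) <= TIME(b.intrvl)
    GROUP BY a.UID, b.intrvl
    ORDER BY a.UID, b.intrvl
""").fetchall()

for row in latest:
    print(row)
```

For Room 4, both the 14:15:00 and 14:30:00 intervals resolve to the 14:00:00 check-in.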
Step 4:
Now we have to grab the value of Users for each last_date so we can take the average of those values. We do this by wrapping the query from the last step as a subselect inside the FROM clause and joining it back onto the main table once more, matching on each UID -> last_date association to pick up the corresponding value of Users.
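One possible shape for that wrapped query is sketched below, with the usual caveats: SQLite through Python's `sqlite3` is a stand-in, the intervals are hard-coded, and the aliases (`r`, `i`, `c` inside, `a` and `b` outside) are illustrative choices.

```python
import sqlite3

# (UID, time, Users) check-ins copied from the question's table
CHECKINS = [(4, "14:00", 3), (2, "14:00", 3), (3, "14:00", 1), (1, "14:00", 2),
            (3, "14:15", 1), (2, "14:15", 4), (1, "14:15", 3), (1, "14:30", 6),
            (1, "14:45", 3), (2, "14:45", 7), (3, "14:45", 8), (4, "14:45", 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (UID INTEGER, Name TEXT, Datetime TEXT, Users INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(uid, f"Room {uid}", f"2012-08-03 {t}:00", n) for uid, t, n in CHECKINS],
)

# The grouped query becomes a subselect; joining on (UID, last_date) recovers Users.
filled = conn.execute("""
    SELECT a.UID, a.intrvl, a.last_date, b.Users
    FROM (
        SELECT r.UID, i.intrvl, MAX(c.Datetime) AS last_date
        FROM (SELECT DISTINCT UID FROM tbl) r
        CROSS JOIN (
            SELECT '14:00:00' AS intrvl UNION ALL
            SELECT '14:15:00' UNION ALL
            SELECT '14:30:00' UNION ALL
            SELECT '14:45:00'
        ) i
        JOIN tbl c
          ON c.UID = r.UID
         AND TIME(c.Datetime) <= TIME(i.intrvl)
        GROUP BY r.UID, i.intrvl
    ) a
    JOIN tbl b
      ON b.UID = a.UID
     AND b.Datetime = a.last_date
    ORDER BY a.UID, a.intrvl
""").fetchall()

# Users per interval for Room 4, with the two missed intervals filled in.
room4_users = [users for uid, _, _, users in filled if uid == 4]
print(room4_users)  # [3, 3, 3, 4]
```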
Step 5:
Now it’s just a simple matter of grouping on each room and averaging the Users column.
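Putting the steps together, a complete query could look something like the sketch below. It is a reconstruction under assumptions, not necessarily the exact query this answer was built around: it runs on SQLite through Python's `sqlite3`, hard-codes the four intervals, and uses arbitrary aliases. The result can be checked against the averages worked out in the question.

```python
import sqlite3

# (UID, time, Users) check-ins copied from the question's table
CHECKINS = [(4, "14:00", 3), (2, "14:00", 3), (3, "14:00", 1), (1, "14:00", 2),
            (3, "14:15", 1), (2, "14:15", 4), (1, "14:15", 3), (1, "14:30", 6),
            (1, "14:45", 3), (2, "14:45", 7), (3, "14:45", 8), (4, "14:45", 4)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (UID INTEGER, Name TEXT, Datetime TEXT, Users INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(uid, f"Room {uid}", f"2012-08-03 {t}:00", n) for uid, t, n in CHECKINS],
)

# Average the carried-forward Users values per room.
averages = conn.execute("""
    SELECT b.Name, AVG(b.Users) AS avg_users
    FROM (
        SELECT r.UID, MAX(c.Datetime) AS last_date
        FROM (SELECT DISTINCT UID FROM tbl) r
        CROSS JOIN (
            SELECT '14:00:00' AS intrvl UNION ALL
            SELECT '14:15:00' UNION ALL
            SELECT '14:30:00' UNION ALL
            SELECT '14:45:00'
        ) i
        JOIN tbl c
          ON c.UID = r.UID
         AND TIME(c.Datetime) <= TIME(i.intrvl)
        GROUP BY r.UID, i.intrvl
    ) a
    JOIN tbl b
      ON b.UID = a.UID
     AND b.Datetime = a.last_date
    GROUP BY b.UID, b.Name
    ORDER BY b.UID
""").fetchall()

print(averages)
# [('Room 1', 3.5), ('Room 2', 4.5), ('Room 3', 2.75), ('Room 4', 3.25)]
```

A check-in that serves as the latest entry for several intervals is joined once per interval, which is what weights it correctly in the average.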