I have a database in Microsoft SQL Server 2008.
I have a table [eventlog].[dbo].[USER_OPERATION] with the columns [userID], [event_description], [event_date], [event_type] and [eventID].
eventID is unique for, well, every given event that can happen.
A given userID is, of course, not unique across events (every user may have many events associated with them), but each userID identifies a single individual user.
What I want is to create a query that would give me a list which only has the latest event for every user (that is to say, every UserID) and its associated info (the event_type, eventID and event_description of a given specific event).
To illustrate:
When performing
SELECT *
FROM [eventlog].[dbo].[USER_OPERATION]
ORDER BY userID ASC
I get something along the lines of
|====================================================================================|
| eventID | userID | event_description | event_date | event_type |
| | | | | |
| 123 | 2 | USER 2 broke something | 03.11.11 | CRASH |
| 391 | 2 | USER 2 filed a complaint | 30.04.10 | COMPLAINT |
| 392 | 2 | USER 2 has bought beer | 31.10.09 | PURCHASE |
| 32 | 3 | USER 3 broke something | 22.10.11 | CRASH |
| 568 | 4 | USER 4 has requested support | 05.12.11 | SUPP_REQ |
| 691 | 4 | USER 4 has bought beer | 01.12.10 | PURCHASE |
| 81 | 4 | USER 4 updated personal data | 17.07.11 | PDAT_UPD |
| 141 | 5 | USER 5 has bought beer | 16.08.11 | PURCHASE |
| 142 | 5 | USER 5 broke something | 16.08.11 | CRASH |
| 269 | 6 | USER 6 updated personal data | 27.01.12 | PDAT_UPD |
| 845 | 7 | USER 7 updated personal data | 27.01.12 | PDAT_UPD |
| | | | | |
|====================================================================================|
As you can see, some users have multiple events with different dates associated with them.
What I want is a query that shows me a list of users together with the most recent date on which each user had an event (the output would, essentially, be a table which lists each user once and shows the most recent event associated with that user, along with its event_description, event_date and event_type info).
Let’s call such a query result “Recent Event Table”.
Do note the “unusual condition” of User 5 (userID 5), who broke something and bought beer on the same date.
In such cases, I don’t care WHICH of the two same-day events goes into the “Recent Event Table”; it can be picked at random or whatever (though I would still need the associated event_description and event_type info).
Ideally, the result would look like this (for same set of users as above):
|====================================================================================|
| eventID | userID | event_description | event_date | event_type |
| | | | | |
| 123 | 2 | USER 2 broke something | 03.11.11 | CRASH |
| 32 | 3 | USER 3 broke something | 22.10.11 | CRASH |
| 568 | 4 | USER 4 has requested support | 05.12.11 | SUPP_REQ |
| 141 | 5 | USER 5 has bought beer | 16.08.11 | PURCHASE |
| 269 | 6 | USER 6 updated personal data | 27.01.12 | PDAT_UPD |
| 845 | 7 | USER 7 updated personal data | 27.01.12 | PDAT_UPD |
| | | | | |
|====================================================================================|
If there is no way to just “pick any one of the two randomly or per some rule” for such “date-dupes” as User 5, having two entries in the “Recent Event Table” would be acceptable for such special cases, since they are very rare and I can handle them manually.
In that (slightly less fortunate) case, the “Recent Event Table” would look like this:
|====================================================================================|
| eventID | userID | event_description | event_date | event_type |
| | | | | |
| 123 | 2 | USER 2 broke something | 03.11.11 | CRASH |
| 32 | 3 | USER 3 broke something | 22.10.11 | CRASH |
| 568 | 4 | USER 4 has requested support | 05.12.11 | SUPP_REQ |
| 141 | 5 | USER 5 has bought beer | 16.08.11 | PURCHASE |
| 142 | 5 | USER 5 broke something | 16.08.11 | CRASH |
| 269 | 6 | USER 6 updated personal data | 27.01.12 | PDAT_UPD |
| 845 | 7 | USER 7 updated personal data | 27.01.12 | PDAT_UPD |
| | | | | |
|====================================================================================|
That is also acceptable (but would need a bit of additional pruning later).
So, to summarize my question: is it possible to construct a Microsoft SQL query that would give me a Recent Event Table along the lines of what was described above?
Thank you very much in advance for your help.
You can use a CTE (Common Table Expression) combined with the ROW_NUMBER() function.
This “partitions” your data into groups, one for each userID, then sorts the events inside each group by event_date DESC and eventID DESC and numbers them. The most recent entry for each user gets RowNum = 1, so just select those rows from the CTE and you’re done!
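The approach described above could be sketched roughly as follows. This is only a sketch under a few assumptions: the CTE name LatestEvents is made up for illustration, and event_date is assumed to be stored as a date/datetime type (if it is stored as a dd.mm.yy string, it would not sort chronologically and would need converting first).

```sql
-- Sketch only: "LatestEvents" is an illustrative name, not from the original post.
-- Assumes event_date is a DATE/DATETIME column so that DESC ordering is chronological.
WITH LatestEvents AS
(
    SELECT
        eventID,
        userID,
        event_description,
        event_date,
        event_type,
        -- Number each user's events, newest first; eventID breaks same-date ties
        ROW_NUMBER() OVER (PARTITION BY userID
                           ORDER BY event_date DESC, eventID DESC) AS RowNum
    FROM [eventlog].[dbo].[USER_OPERATION]
)
SELECT
    eventID,
    userID,
    event_description,
    event_date,
    event_type
FROM LatestEvents
WHERE RowNum = 1          -- keep only the most recent event per user
ORDER BY userID ASC;
```

Because eventID DESC is used as the tie-breaker, a user with two events on the same date (such as User 5) gets the row with the higher eventID. If you would rather see both tied rows, replacing ROW_NUMBER() with RANK() and ordering by event_date DESC alone should give every event on the latest date a value of 1.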