I have a panel data set: that is, times, ids, and values. I would like to do a ranking based on value for each date. I can achieve the sort very simply by running:
select * from tbl order by date, value
The issue I have is that, once the table is sorted this way, I don't know how to retrieve the row number within each group. That is, for each date I would like a column called ranking that goes from 1 to N.
Example:
Input:
Date, ID, Value
d1, id1, 2
d1, id2, 1
d2, id1, 10
d2, id2, 11
Output:
Date, ID, Value, Rank
d1, id2, 1, 1
d1, id1, 2, 2
d2, id1, 10, 1
d2, id2, 11, 2
Absent window functions, you can order tbl and use user variables to compute rank over your partitions (“date” values) yourself:
Update
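The query this answer walks through is not reproduced here, so what follows is only a sketch of what it might look like, assuming MySQL and the tbl(date, id, value) layout from the question. The labels A to D are placed to match the walkthrough in this answer. The alias ranked, the output column name ranking (chosen because the question asks for it, and because RANK is a reserved word in MySQL 8.0), the NULL initializer for @partition, and the column types in the setup are all my assumptions.

```sql
-- Hypothetical setup mirroring the question's example (column types are assumptions).
CREATE TABLE tbl (`date` VARCHAR(10), id VARCHAR(10), value INT);
INSERT INTO tbl (`date`, id, value) VALUES
  ('d1', 'id1', 2), ('d1', 'id2', 1), ('d2', 'id1', 10), ('d2', 'id2', 11);

SELECT `date`, id, value, ranking                 -- D: project everything except dummy
FROM (
  SELECT `date`, id, value,                       -- C: "loop" over the sorted rows
         -- same date as the previous row: increment; otherwise restart at 1
         @rank := IF(@partition = `date`, @rank + 1, 1) AS ranking,
         -- remember this row's date for the next comparison
         @partition := `date` AS dummy
  FROM (
    SELECT dummy.`rank`, dummy.`partition`, tbl.`date`, tbl.id, tbl.value
    FROM (SELECT @rank := 0 AS `rank`,
                 @partition := NULL AS `partition`) dummy   -- A: initialize the variables
         CROSS JOIN tbl                                     -- B: the table's rows ...
    ORDER BY tbl.`date`, tbl.value                          --    ... by date, then value
  ) tbl_ordered
) ranked;
```

On the sample rows this is intended to produce the output shown in the question. Treat it as a workaround for older servers: the MySQL manual describes the evaluation order of expressions involving user variables as undefined, and assigning user variables inside a SELECT is deprecated as of MySQL 8.0.13.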
So, what is that query doing?
We are using user variables to “loop” through a sorted result set, incrementing or resetting a counter (@rank) depending upon which contiguous segment of the result set (tracked in @partition) we’re in.
In query A we initialize two user variables. In query B we get the records of your table in the order we need: first by date and then by value. A and B together make a derived table, tbl_ordered, in which each sorted row of your table carries two extra columns, dummy.rank and dummy.partition.
Remember, we don’t really care about the columns dummy.rank and dummy.partition; they’re just accidents of how we initialize the variables @rank and @partition.
In query C we loop through the derived table’s records. What we’re doing is more or less this: for each record, if its date equals @partition, increment @rank; otherwise reset @rank to 1. Either way, set @partition to that record’s date before moving on to the next record.
Finally, query D projects all columns from C except for the column holding @partition (which we named dummy and do not need to display).
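For comparison, where window functions are available (MySQL 8.0 and later, to my knowledge), the same per-date numbering can be written directly. This is a sketch rather than part of the original answer: the CTE named tbl is only a stand-in for the question's table so the example runs on its own, and ranking is the column name the question asks for.

```sql
-- Requires window-function support (MySQL 8.0 or later).
WITH tbl (`date`, id, value) AS (   -- stand-in for the question's table; drop to use the real one
  SELECT 'd1', 'id1', 2  UNION ALL
  SELECT 'd1', 'id2', 1  UNION ALL
  SELECT 'd2', 'id1', 10 UNION ALL
  SELECT 'd2', 'id2', 11
)
SELECT `date`, id, value,
       -- number rows 1..N within each date, smallest value first
       ROW_NUMBER() OVER (PARTITION BY `date` ORDER BY value) AS ranking
FROM tbl
ORDER BY `date`, ranking;
```

ROW_NUMBER() always counts 1 to N within a date, which matches what the question asks for. If tied values should share a rank instead, RANK() or DENSE_RANK() could be swapped in.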