I am using Microsoft SQL Server 2008 and trying to identify the chronological order of data points. The goal is a filter field that lets me write a query returning only the first and last record for each ID number, where multiple rows represent different data points for the same ID.
Here is an example of my current data and desired data to give a better idea of what I mean:
Current Data
ID Indicator Date
1 1 1988-02-11
1 1 1989-03-9
1 1 1993-04-3
1 1 2001-05-4
2 1 2000-01-01
2 1 2001-02-03
2 1 2002-04-22
3 1 1990-02-01
3 1 1998-02-01
3 1 1999-03-02
3 1 2000-04-02
4 0 NA
Desired Data
ID Indicator Date Order_Indicator
1 1 1988-02-11 1
1 1 1989-03-9 2
1 1 1993-04-3 3
1 1 2001-05-4 4
2 1 2000-01-01 1
2 1 2001-02-03 2
2 1 2002-04-22 3
3 1 1990-02-01 1
3 1 1998-02-01 2
3 1 1999-03-02 3
3 1 2000-04-02 4
4 0 NULL NULL
The field I want to create is the “Order_Indicator” field in the “Desired Data” table; the only relevant records are those with Indicator = 1. With this field I could then write a query that selects only the rows where Order_Indicator = 1 or Order_Indicator = MAX(Order_Indicator) within each group of rows sharing the same ID. Does anyone have any idea how I might go about this? I know I could do this very easily in Excel, but I need to do it in SQL Server so that my colleagues can reproduce it.
Thank you so much in advance!
You can do this with the ranking functions:
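A minimal sketch of such a query, assuming the table is named `t` (a placeholder, since the real table name is not given) and the columns are named as in the question:

```sql
-- Assumption: the table is called t; substitute the real table name.
SELECT t.*,
       -- Number rows 1, 2, 3, ... within each ID, oldest date first.
       -- Rows with Indicator = 0 fall through the CASE and get NULL.
       CASE WHEN t.Indicator = 1
            THEN ROW_NUMBER() OVER (PARTITION BY t.ID, t.Indicator
                                    ORDER BY t.[Date])
       END AS Order_Indicator
FROM t;
```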
This assigns a sequential number within each ID and indicator, ordered by date. The CASE expression takes care of the Indicator = 0 case, leaving Order_Indicator as NULL for those rows.
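By the way, this assumes that “date” is stored in a date or datetime column. If it is stored as a string, the rows will be ordered as text rather than chronologically.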
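To go straight to the first and last record per ID, one option is to number the rows in both directions and keep the rows that rank first in either. This is only a sketch under the same assumptions as before: `t` is a placeholder table name, and `rn_first` and `rn_last` are illustrative column names.

```sql
-- Assumption: the table is called t; rn_first and rn_last are made-up names.
WITH ranked AS (
    SELECT t.ID,
           t.Indicator,
           t.[Date],
           -- 1 marks the earliest row for each ID
           ROW_NUMBER() OVER (PARTITION BY t.ID ORDER BY t.[Date] ASC)  AS rn_first,
           -- 1 marks the latest row for each ID
           ROW_NUMBER() OVER (PARTITION BY t.ID ORDER BY t.[Date] DESC) AS rn_last
    FROM t
    WHERE t.Indicator = 1          -- only rows with Indicator = 1 are relevant
)
SELECT ID, Indicator, [Date]
FROM ranked
WHERE rn_first = 1 OR rn_last = 1  -- first or last record per ID
ORDER BY ID, [Date];
```

An ID with a single qualifying row is returned once, because that row is both first and last. Numbering in descending order avoids having to compute MAX(Order_Indicator) per ID in a separate step.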