I have a SQL Server 2008 database where I would like to be able to do the following on one table.
The table has several columns, and its rows should be unique based on a combination of two of them plus a key.
We will call the two columns [ID1] and [ID2]. There is also a key we will call [index], a value that could be duplicated called [ID3], and a datetime value called [start].
So here is the dilemma. Within the scope of the table there should be only one increasing sequence of [index] values for every [ID1] and [ID2] combination; together the three form a natural PK from the client DB, which is being warehoused in one consolidated server database.
[ID3] is indicative of a value that was used to determine when a row was stored in the client DB, so there may be duplicates in the server.
[ID1]  [ID2]  [index]  [ID3]  [start]              [other1]  [other2]
1      1      1        1      01/01/2000 01:00:00  5         6
1      1      2        2      01/01/2000 01:00:01  4         2
1      1      3        3      01/01/2000 01:00:02  5         2
1      1      4        3      01/01/2000 01:00:03  5         2
1      1      5        4      01/01/2000 01:00:04  4         6
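To make the sample reproducible, the rows above could be loaded with a script along these lines. This is only a sketch: the table name dbo.WarehouseRows and the column types (int and datetime) are assumptions for illustration, not the real schema.

```sql
-- Hypothetical table name and column types; only the column names
-- and sample values come from the question.
CREATE TABLE dbo.WarehouseRows
(
    [ID1]    int      NOT NULL,
    [ID2]    int      NOT NULL,
    [index]  int      NOT NULL,
    [ID3]    int      NOT NULL,
    [start]  datetime NOT NULL,
    [other1] int      NOT NULL,
    [other2] int      NOT NULL,
    -- The natural key described above: ([ID1], [ID2], [index]).
    CONSTRAINT PK_WarehouseRows PRIMARY KEY ([ID1], [ID2], [index])
);

-- Multi-row VALUES is available from SQL Server 2008 onwards.
INSERT INTO dbo.WarehouseRows
    ([ID1], [ID2], [index], [ID3], [start], [other1], [other2])
VALUES
    (1, 1, 1, 1, '2000-01-01T01:00:00', 5, 6),
    (1, 1, 2, 2, '2000-01-01T01:00:01', 4, 2),
    (1, 1, 3, 3, '2000-01-01T01:00:02', 5, 2),
    (1, 1, 4, 3, '2000-01-01T01:00:03', 5, 2),
    (1, 1, 5, 4, '2000-01-01T01:00:04', 4, 6);
```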
What I want is a query that returns the rows with unique combinations of [ID3], [other1] and [other2] within each [ID1], [ID2] key. For each such combination I would like the row with the first [start], essentially ignoring further occurrences of the same distinct combination.
From the above table it would return …
[ID1]  [ID2]  [index]  [ID3]  [start]              [other1]  [other2]
1      1      1        1      01/01/2000 01:00:00  5         6
1      1      2        2      01/01/2000 01:00:01  4         2
1      1      3        3      01/01/2000 01:00:02  5         2
1      1      5        4      01/01/2000 01:00:04  4         6
The second row with an [ID3] value of 3 (the one with [index] 4) would be ignored, as would any other row that duplicates an earlier [ID3].
The part I cannot work out is how to get the first row of each distinct combination: DISTINCT does not let me select the values of the other columns, and GROUP BY would require an aggregate function on them.
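One possible way to express "the first row of each distinct combination" without DISTINCT or GROUP BY is a ROW_NUMBER() window function, which SQL Server 2008 supports. The following is only a sketch under assumptions: dbo.WarehouseRows is a hypothetical table name, and "first" is taken to mean the earliest [start], with [index] as a tie-breaker.

```sql
-- dbo.WarehouseRows is a hypothetical name for the table described above.
WITH Ranked AS
(
    SELECT
        [ID1], [ID2], [index], [ID3], [start], [other1], [other2],
        -- Number the rows inside each distinct combination,
        -- earliest [start] first ([index] breaks ties).
        ROW_NUMBER() OVER
        (
            PARTITION BY [ID1], [ID2], [ID3], [other1], [other2]
            ORDER BY [start], [index]
        ) AS rn
    FROM dbo.WarehouseRows
)
SELECT [ID1], [ID2], [index], [ID3], [start], [other1], [other2]
FROM Ranked
WHERE rn = 1                      -- keep only the first occurrence
ORDER BY [ID1], [ID2], [index];
```

Against the sample data, the two rows with [ID3] = 3 share the same [other1] and [other2] values, so they fall into one partition and only the row with [index] 3 is kept. If duplicates should be judged on [ID3] alone, [other1] and [other2] would be dropped from the PARTITION BY list.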
1 Answer