I have a table with columns that look like this:
label_id, scan_type, scan_cost
And some rows that look like this:
001, A40, 70
001, A40, 70
002, A40, 85
003, A40, 85
003, A40, 85
I need to produce a result set that looks like this:
001, A40, 70
001, A40, 0
002, A40, 85
003, A40, 85
003, A40, 0
That is, for any repeated occurrence of the same label_id, the scan_cost column needs to be set to 0, while the first occurrence of each label_id keeps its value untouched.
The label_id values are not sequential, if that changes anything.
Is it possible to achieve this behaviour in SQL? Note that the SQL dialect is T-SQL, on Microsoft SQL Server 2008.
This query returns what you are looking for:
The idea behind this solution is to partition the data by label_id, and use the ROW_NUMBER function to decide which data to keep.
I used scan_type in the ORDER BY sub-clause, which is not ideal. If your actual table has a column with data that is better suited to determine what row is first, e.g. a timestamp column, you should use that other column instead. The second column in the outer ORDER BY needs to be the same as the column in the inner ORDER BY.
Here is a link to this query on sqlfiddle.
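For reference, the approach described above can be sketched roughly as follows. This is a minimal illustration rather than the exact query from the answer: the table variable @scans, its column types, and the aliases rn and numbered are assumptions made for the example, since the question does not give a table name.

```sql
-- Hypothetical table variable standing in for the real table
-- (name and column types are assumptions, not from the question).
DECLARE @scans TABLE (
    label_id  CHAR(3),
    scan_type VARCHAR(10),
    scan_cost INT
);

-- Sample rows copied from the question.
INSERT INTO @scans (label_id, scan_type, scan_cost) VALUES
    ('001', 'A40', 70),
    ('001', 'A40', 70),
    ('002', 'A40', 85),
    ('003', 'A40', 85),
    ('003', 'A40', 85);

SELECT label_id,
       scan_type,
       -- Keep the cost on the first row of each label_id, zero it on the rest.
       CASE WHEN rn = 1 THEN scan_cost ELSE 0 END AS scan_cost
FROM (
    SELECT label_id,
           scan_type,
           scan_cost,
           -- Number the rows within each label_id. scan_type is only a
           -- placeholder ordering; a timestamp column would be a better choice.
           ROW_NUMBER() OVER (PARTITION BY label_id
                              ORDER BY scan_type) AS rn
    FROM @scans
) AS numbered
-- Sorting on rn keeps the untouched row ahead of its zeroed duplicates.
ORDER BY label_id, rn;
```

With the sample rows from the question, this should produce the requested result set. The outer ORDER BY here sorts on rn rather than repeating scan_type; because rn is derived from the inner ORDER BY, this is one way to keep the two orderings consistent even when the ordering column contains ties.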