I am trying to write a query to remove duplicate records from the following table (valid_columns) and keep only the record with the lowest [order] number for each name.
For example, in the following table I would like to remove the duplicate rows (region 2, region 3 and job 3) and keep the record with the lowest [order] for each name.
The input table, valid_columns, looks like this:
name col_order
-------------
job 1
job 3
status 2
cust 2
county 1
state 1
region 1
region 2
region 3
so 4
Desired Output:
name col_order
-------------
job 1
status 2
cust 2
county 1
state 1
region 1
so 4
I am trying to fix a bug and I can’t figure out the SQL. Currently it uses a DELETE statement and a subquery. The query used at the moment looks like this:
-- 3) Remove duplicated columns
DELETE
FROM valid_columns
WHERE NOT ( col_order = ( SELECT TOP 1 col_order
FROM valid_columns firstValid
WHERE name = firstValid.name
AND col_order = firstValid.col_order
ORDER BY col_order ASC ))
However, this leaves only the following rows in the table, which is incorrect:
name col_order
-------------
job 1
county 1
state 1
region 1
Many thanks
EDIT:
can be simplified to this:
The DELETE statement can have a FROM clause to delete a record based on the value of a related record in a second table. In this case the FROM is not really required; I sometimes use it to alias the table name because I don’t like the extra typing.
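As a rough sketch of what that could look like for SQL Server (not necessarily the exact query meant above): in the original statement, the unqualified name and col_order inside the subquery resolve to firstValid itself, so both conditions are always true and the subquery returns the lowest col_order in the whole table. Aliasing the outer table lets the subquery refer to the row being deleted. The alias vc is just an illustrative name.

```sql
-- Sketch for SQL Server (T-SQL); "vc" is an assumed alias, not from the original query.
-- Delete every row whose col_order is higher than the lowest col_order
-- recorded for the same name.
DELETE vc
FROM valid_columns AS vc
WHERE vc.col_order > (SELECT MIN(firstValid.col_order)
                      FROM valid_columns AS firstValid
                      WHERE firstValid.name = vc.name);
```

With the sample data this would remove job 3, region 2 and region 3. Note that two rows with the same name and the same col_order would both be kept.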
You could also try this example (might be faster if you have to do this a lot):
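One possible alternative, offered only as a sketch and not necessarily the example referred to here, is to number the rows per name with ROW_NUMBER() and delete everything after the first. This assumes SQL Server 2005 or later, where a DELETE can target a common table expression; the names ranked and rn are made up for illustration.

```sql
-- Sketch for SQL Server 2005+; "ranked" and "rn" are illustrative names.
-- The statement before WITH must be terminated with a semicolon.
WITH ranked AS (
    SELECT name,
           col_order,
           -- rn = 1 marks the row with the lowest col_order for each name
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY col_order ASC) AS rn
    FROM valid_columns
)
DELETE FROM ranked
WHERE rn > 1;
```

Unlike a comparison against the minimum, this also collapses rows that share both name and col_order down to a single row.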