I want to get redundant records from the database. Is my query correct for this?
select (fields)
from DB
group by name, city
having count(*) > 1
If it is wrong, please let me know how I can correct it.
Also, if I want to delete the duplicate records, will this work?
delete from tbl_name
where row_id in
(select row_id from tbl_name group by name, city having count(*) > 1)
So can I write the above query like this?
DELETE FROM tb_name where row_id not in(select min(row_id) from tb_name groupBy(name, city) having count(*)>1)
Your DELETE syntax is definitely wrong; that will never work. What it will do is delete all rows that have more than one occurrence, leaving no data around.
What you can do in SQL Server 2005 and up is use a CTE (Common Table Expression) and the ROW_NUMBER() ranking function.
You basically create “partitions” of your data by the (name, city) combo; within each of those pairs, the rows get sequential numbers from 1 on up. Those that have more than one occurrence will also have entries in that CTE with a RowNum > 1. Just delete all of those and your duplicates are gone!
Read about Using Common Table Expressions in SQL Server 2005 and about Ranking Functions and Performance in SQL Server 2005 (or consult the MSDN docs on those topics).
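
The CTE and ROW_NUMBER() approach described above can be sketched roughly as follows. This is a minimal illustration, not the original answer's code: the table layout (dbo.tbl_name with row_id, name, city), the sample rows, and the CTE name Numbered are assumptions made up for the example, and ordering by row_id is just one way to decide which row in each group survives (here, the one with the lowest row_id).

```sql
-- Hypothetical table matching the columns mentioned in the question.
CREATE TABLE dbo.tbl_name (
    row_id INT IDENTITY(1,1) PRIMARY KEY,
    name   VARCHAR(50) NOT NULL,
    city   VARCHAR(50) NOT NULL
);

-- Made-up sample data: ('Ann', 'Oslo') appears three times.
-- Separate INSERTs are used so the script also runs on SQL Server 2005.
INSERT INTO dbo.tbl_name (name, city) VALUES ('Ann', 'Oslo');
INSERT INTO dbo.tbl_name (name, city) VALUES ('Ann', 'Oslo');
INSERT INTO dbo.tbl_name (name, city) VALUES ('Bob', 'Rome');
INSERT INTO dbo.tbl_name (name, city) VALUES ('Ann', 'Oslo');

-- List the (name, city) combinations that occur more than once.
-- Only the grouped columns and aggregates appear in the SELECT list.
SELECT name, city, COUNT(*) AS occurrences
FROM dbo.tbl_name
GROUP BY name, city
HAVING COUNT(*) > 1;

-- Number the rows inside each (name, city) partition, lowest row_id first.
-- The first row of every partition gets RowNum = 1; duplicates get 2, 3, ...
WITH Numbered AS (
    SELECT row_id, name, city,
           ROW_NUMBER() OVER (PARTITION BY name, city ORDER BY row_id) AS RowNum
    FROM dbo.tbl_name
)
-- Deleting through the CTE removes the underlying rows from dbo.tbl_name.
DELETE FROM Numbered
WHERE RowNum > 1;

-- One row per (name, city) pair should remain.
SELECT row_id, name, city
FROM dbo.tbl_name
ORDER BY row_id;
```

Before running the DELETE against real data, it may be worth replacing it with SELECT * FROM Numbered WHERE RowNum > 1 to preview exactly which rows would be removed.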