I need to remove duplicate rows from a fairly large SQL Server table (300,000+ rows).
The rows, of course, will not be perfect duplicates because of the existence of the RowID identity field.
MyTable
RowID int not null identity(1,1) primary key,
Col1 varchar(20) not null,
Col2 varchar(2048) not null,
Col3 tinyint not null
How can I do this?
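Assuming no nulls, you GROUP BY the unique columns, and SELECT the MIN (or MAX) RowId as the row to keep. Then delete every row whose RowId is not among the ones kept.
In case you have a GUID instead of an integer, MIN cannot be applied to it directly, so you can replace MIN(RowId) with an expression that converts the GUID to char(36), takes the MIN of that, and converts the result back to uniqueidentifier.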
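The approach above can be sketched in T-SQL roughly as follows. This is a minimal sketch, assuming the table and column names from the question (MyTable, RowID, Col1, Col2, Col3) and that a set of rows counts as duplicates when all three data columns match. The alias KeepRows is just an illustrative name.

```sql
-- Keep the lowest RowID in each group of identical (Col1, Col2, Col3)
-- values and delete every other row.
DELETE MyTable
FROM MyTable
LEFT OUTER JOIN (
    SELECT MIN(RowID) AS RowID, Col1, Col2, Col3
    FROM MyTable
    GROUP BY Col1, Col2, Col3
) AS KeepRows
    ON MyTable.RowID = KeepRows.RowID
WHERE KeepRows.RowID IS NULL;  -- no match means the row is not a keeper
```

If the key were a uniqueidentifier rather than an int, the aggregate in the subquery might look like this instead (MyGuidColumn is a hypothetical column name, not one from the question):

```sql
-- Hypothetical GUID key: aggregate on the string form, then convert back.
SELECT CONVERT(uniqueidentifier, MIN(CONVERT(char(36), MyGuidColumn))) AS RowID,
       Col1, Col2, Col3
FROM MyTable
GROUP BY Col1, Col2, Col3;
```

Before running the DELETE, it may be worth running the inner SELECT on its own, or wrapping the statement in a transaction, to confirm that the rows being kept are the ones you expect.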