I’m looking for a schema-independent query. That is, if I have a users table or a purchases table, the query should be equally capable of catching duplicate rows in either table without any modification (other than the from clause, of course).
I’m using T-SQL, but I’m guessing there should be a general solution.
I believe that this should work for you. Keep in mind that CHECKSUM() isn’t collision-free: two different rows can produce the same checksum, so a false positive is possible and any match is best treated as a candidate duplicate. Otherwise, you can just change the table name and this should work:
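A minimal sketch of a query along these lines, not necessarily the exact original. The table name My_Table and the aliases chksum, row_num, and duplicate_row_num are placeholders chosen for illustration. It assumes every column has a type that CHECKSUM() accepts (it rejects text, ntext, image, and xml, for instance).

```sql
-- Sketch only: My_Table is a placeholder; change it to the table you want to check.
;WITH cte AS (
    SELECT
        *,
        CHECKSUM(*) AS chksum,                                -- one hash value computed over every column in the row
        ROW_NUMBER() OVER (ORDER BY GETDATE()) AS row_num     -- arbitrary but unique number per row
    FROM
        My_Table
)
SELECT
    T1.*,
    T2.row_num AS duplicate_row_num
FROM
    cte AS T1
INNER JOIN cte AS T2 ON
    T2.chksum = T1.chksum AND      -- same checksum: candidate duplicate
    T2.row_num > T1.row_num;       -- a different row; ">" reports each pair only once
```

Because matching checksums do not guarantee identical rows, any pair returned here may still need a column-by-column comparison before anything is deleted.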
The ROW_NUMBER() is needed so that you have some way of distinguishing rows. It requires an ORDER BY and that can’t be a constant, so GETDATE() was my workaround for that.
Simply change the table name in the CTE and it should work without spelling out the columns.