Please don’t ask me why, but there is a lot of duplicate data where every field is duplicated.
For example:
alex, 1
alex, 1
liza, 32
hary, 34
I need to eliminate one of the alex, 1 rows from this table.
I know this algorithm will be very inefficient, but that does not matter. I just need to remove the duplicate data.
What is the best way to do this? Please keep in mind that I do not have 2 fields; I actually have about 10 fields to check.
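Method A. You can get a deduped version of your data by selecting only the distinct rows into a new table (for instance, with a SELECT DISTINCT over all of the columns).
For example, for your sample data, this yields
alex, 1
liza, 32
hary, 34
Then simply delete the old table and rename the new one. Of course, there are a number of fancy in-place solutions, but this is the clearest way to do it.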
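Here is a minimal sketch of Method A. The question doesn't say which database is in use, so this assumes SQLite driven from Python's built-in sqlite3 module; the column names (name, value) and the helper names are made up for illustration, and the exact syntax for copying into a new table and renaming it will differ on other databases.

```python
import sqlite3

def make_sample_source():
    """Build an in-memory table holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    # Column names are hypothetical; the question only shows the values.
    conn.execute("CREATE TABLE Source (name TEXT, value INTEGER)")
    conn.executemany(
        "INSERT INTO Source VALUES (?, ?)",
        [("alex", 1), ("alex", 1), ("liza", 32), ("hary", 34)],
    )
    conn.commit()
    return conn

def dedupe_by_copy(conn, table="Source"):
    """Copy the distinct rows into a new table, then swap it in for the old one.

    The table name is interpolated directly, so it must be a trusted identifier.
    """
    # SELECT DISTINCT * compares every column, so it works for 2 or 10 fields.
    conn.execute(f"CREATE TABLE {table}_dedup AS SELECT DISTINCT * FROM {table}")
    conn.execute(f"DROP TABLE {table}")
    conn.execute(f"ALTER TABLE {table}_dedup RENAME TO {table}")
    conn.commit()

conn = make_sample_source()
dedupe_by_copy(conn)
rows = sorted(conn.execute("SELECT name, value FROM Source").fetchall())
print(rows)
```

Note that a table created this way in SQLite does not carry over indexes or constraints from the old one, so those would need to be recreated after the swap.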
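Method B. An in-place method is to create a primary key and delete duplicates that way. For example, you can add an auto-incrementing integer key column (called Id here), which makes Source look like this
1, alex, 1
2, alex, 1
3, liza, 32
4, hary, 34
Then you can delete every row whose Id is not the smallest Id among the rows that share its values (a DELETE with a NOT IN subquery that takes MIN(Id) from Source, grouped by all of the data fields), which will give the desired result. Of course, NOT IN is not exactly the most efficient approach, but it will do the job. Alternatively, you can LEFT JOIN the grouped table (maybe stored in a TEMP table) and do the DELETE that way.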
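Here is a rough sketch of the in-place approach, again assuming SQLite through Python's sqlite3 module. SQLite cannot add a primary key column with ALTER TABLE, so its built-in rowid stands in for the added key here; on another database you would add your own auto-incrementing key first. The column names and the helper names are hypothetical.

```python
import sqlite3

def make_sample_source():
    """Build an in-memory table holding the sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    # Column names are hypothetical; the question only shows the values.
    conn.execute("CREATE TABLE Source (name TEXT, value INTEGER)")
    conn.executemany(
        "INSERT INTO Source VALUES (?, ?)",
        [("alex", 1), ("alex", 1), ("liza", 32), ("hary", 34)],
    )
    conn.commit()
    return conn

def dedupe_in_place(conn, table, columns):
    """Keep the lowest-keyed row of each duplicate group and delete the rest.

    Table and column names are interpolated directly, so they must be
    trusted identifiers. List every field that defines a duplicate.
    """
    col_list = ", ".join(columns)
    conn.execute(
        f"DELETE FROM {table} "
        f"WHERE rowid NOT IN ("
        f"SELECT MIN(rowid) FROM {table} GROUP BY {col_list})"
    )
    conn.commit()

conn = make_sample_source()
dedupe_in_place(conn, "Source", ["name", "value"])
remaining = conn.execute(
    "SELECT rowid, name, value FROM Source ORDER BY rowid"
).fetchall()
print(remaining)
```

With about 10 fields, you would pass all 10 column names so that GROUP BY compares the whole row.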