I am trying to find duplicate rows in my DB, like this:
SELECT emailid, COUNT(emailid) AS NumOccurrences
FROM users
GROUP BY emailid
HAVING COUNT(emailid) > 1
This returns the emailid and the number of matches found. Now what I want to do is match the emailid against another table I have and set a column there with the count.
The other table has a column named duplicates, which should contain the number of duplicates from the select. So let’s say we have 3 rows with the same emailid. The duplicates column then has a “3” in all 3 rows. What I want is a “2” in the first 2 rows and nothing (or 0) in the last of the 3 matching rows.
Is this possible?
Update:
I managed to have a temporary table now, which looks like this:
mailid | rowcount | AmountOfDups
643921 | 1 | 3
643921 | 2 | 3
643921 | 3 | 3
Now, how can I make sure that only the first 2 rows are updated (matched by mailid) in the other table? The other table has a mailid column as well.
…is a great starting point for such a problem. Never underestimate the power of ROW_NUMBER()!
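To make the ROW_NUMBER() idea concrete, here is a minimal sketch of one way it could work. It is not necessarily what the asker ended up with. Several things are assumptions: the question does not name the DBMS, so this runs the SQL against an in-memory SQLite database through Python's sqlite3 module (window functions need SQLite 3.25 or newer); the table name other_table and its id key column are hypothetical; and "first" is taken to mean lowest id within each mailid. Each row gets its position (rn) and the group size (cnt), and every row except the last one in a group is set to cnt - 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for "the other table"; `id` is an assumed key
# column used to tell rows with the same mailid apart.
conn.executescript("""
CREATE TABLE other_table (
    id         INTEGER PRIMARY KEY,
    mailid     INTEGER,
    duplicates INTEGER
);
INSERT INTO other_table (mailid) VALUES (643921), (643921), (643921), (700000);
""")

# rn  = position of the row within its mailid group (ordered by id)
# cnt = total number of rows sharing that mailid
# All rows except the last in a group get cnt - 1; the last one gets 0.
conn.execute("""
WITH numbered AS (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY mailid ORDER BY id) AS rn,
           COUNT(*)     OVER (PARTITION BY mailid)             AS cnt
    FROM other_table
)
UPDATE other_table
SET duplicates = (
    SELECT CASE WHEN rn < cnt THEN cnt - 1 ELSE 0 END
    FROM numbered
    WHERE numbered.id = other_table.id
)
""")

rows = conn.execute(
    "SELECT id, mailid, duplicates FROM other_table ORDER BY id"
).fetchall()
for row in rows:
    print(row)
```

With the sample data, the three 643921 rows end up with 2, 2 and 0, and the unique mailid gets 0. On SQL Server or another engine the same CTE can usually be joined in the UPDATE directly instead of using a correlated subquery, but the exact syntax differs per DBMS.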