I need to query statistics about duplicate values in my database table. For example, say that I have an e-mail field and multiple rows can have the same e-mail. I know want to know is how many addresses is repeated how many times. In other words: “908 emails are repeated 10 times, 1783 emails are repeated 9 times” and so on.
Repeated # of Emails
10 908
9 1783
I don’t need to see the actual e-mail addresses, just these statistics.
Right know I have this query which also retrieves the e-mail address:
select email_address,
count(email_address) as NumberOccurrences
from table_user_info
group by email_address
having ( count(email_address) > 1 )
How do I group these results?
An aggregate
COUNT()with a subquery also returning an aggregateCOUNT()will provide this. The subquery groups and counts per email address, as inabc@example.com - 10, and the outer query then counts and groups by the number of repeats returned by the subquery, discarding the actual email addresses.