Consider the following “tweets” table
tweet_id user_id text
-----------------------------------------
1 1 look at my tweet
2 1 look at my tweet
3 1 a very different tweet
4 1 look at my tweet
5 1 look at my tweets
6 2 a cool tweet
7 2 this is my tweet
8 1 hello
9 1 hello
For each user, I want to count the number of duplicate tweets. In the example above, user_id 1 has a total of 5 tweets, of which 2 are unique (tweet_id 3 and 5) and 3 are duplicate (1, 2 and 4). So the outcome of the query for user 1 should be “3”.
[EDIT]
Look at user_id 1. The tweet “look at my tweet” appears 3 times, the tweet “hello” 2 times. The total number of duplicate tweets is then 3 + 2 = 5.
For the first part, you can use the following query
The inner query finds all users and tweets that have occured more than once. The outer query adds up the duplicate values for each user