Basically, I’d like to do:
SELECT * FROM `table`
WHERE (`col1`, `col2`)
IN
(
SELECT `col1`, `col2`
FROM `table`
GROUP BY `col1`, `col2`
HAVING COUNT(*) > 1
)
I’d like this to select every (col1, col2) pair that occurs more than once, along with all of the rows that share it.
But how can I keep col1 and col2 paired up in the IN query?
I know there are other ways to do this.
One method is building a dummy table, moving all the relevant entries over to it, then replacing the original.
The other uses a join like:
SELECT *
FROM `table` t1
JOIN `table` t2
ON t1.id > t2.id
AND t1.col1 = t2.col1
AND t1.col2 = t2.col2;
But that takes about 10 minutes to complete :\
This will show all duplicates, sorted together:
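The query that belonged here does not appear in the post. As a rough sketch of one query that fits the description (not necessarily the one the answer had in mind), you could join the table to a derived table of the duplicated pairs and order by the pair so the copies sit next to each other. It assumes MySQL, the `id` column from the JOIN version, and the made-up aliases `t` and `dup`:

```sql
-- Sketch only: `id` is assumed to exist, as in the JOIN version above.
SELECT t.*
FROM `table` AS t
JOIN (
    -- every (col1, col2) pair that occurs more than once
    SELECT `col1`, `col2`
    FROM `table`
    GROUP BY `col1`, `col2`
    HAVING COUNT(*) > 1
) AS dup
  ON dup.`col1` = t.`col1`
 AND dup.`col2` = t.`col2`
-- keep rows with the same pair together
ORDER BY t.`col1`, t.`col2`, t.`id`;
```

Note that rows where col1 or col2 is NULL will not be matched by the equality comparisons, so they are left out of the result.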
An index on (col1, col2) would help the above, and also your JOIN version.
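For reference, adding such an index in MySQL could look like the sketch below. The index name `idx_col1_col2` is just a placeholder, and building the index on a large table may itself take a while:

```sql
-- idx_col1_col2 is a hypothetical name; use whatever fits your naming scheme.
ALTER TABLE `table` ADD INDEX idx_col1_col2 (`col1`, `col2`);

-- Optional: check whether the optimizer picks the new index up.
EXPLAIN
SELECT `col1`, `col2`
FROM `table`
GROUP BY `col1`, `col2`
HAVING COUNT(*) > 1;
```

If the index is used, the GROUP BY can read the pairs in index order instead of sorting the whole table, and the self-join can look up matching rows by (col1, col2) directly.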