I’m trying to remove duplicates (sometimes triplicates, unfortunately!) from a MySQL table. My issue is that the only unique value available is the primary key, so to identify duplicates you have to take all the other columns into account.
I’ve managed to identify all records that have duplicates and copied them, along with their duplicates (including their primary keys), to the table temp. The source table is called translation and it has an integer primary key named TranslationID. How do I move on from here? Thanks!
Edit: The available columns are:
TranslationID
LanguageID
Translation
Etymology
Type
Source
Comments
WordID
Latest
DateCreated
AuthorID
Gender
Phonetic
NamespaceID
Index
EnforcedOwner
The duplication issue is limited to the rows where the Latest column is set to 1.
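For reference, the identification step described above can be sketched as a GROUP BY over the non-key columns. This is only an illustration, not the exact query used: it runs against an in-memory SQLite database instead of MySQL, and the table is a cut-down, hypothetical version of translation with just a handful of the columns listed.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the table is a cut-down,
# hypothetical version of `translation` with only a few of its columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE translation (
           TranslationID INTEGER PRIMARY KEY,
           LanguageID    INTEGER,
           Translation   TEXT,
           WordID        INTEGER,
           Latest        INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO translation VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "water", 10, 1),
        (2, 1, "water", 10, 1),  # duplicate of row 1
        (3, 1, "water", 10, 1),  # triplicate of row 1
        (4, 1, "water", 10, 0),  # older revision (Latest = 0), not counted
        (5, 2, "fire", 11, 1),   # unique
    ],
)

# Group on every non-key column; any group with more than one row is a
# set of duplicates. MIN(TranslationID) is the copy one might keep.
duplicate_groups = conn.execute(
    """SELECT MIN(TranslationID), COUNT(*)
       FROM translation
       WHERE Latest = 1
       GROUP BY LanguageID, Translation, WordID
       HAVING COUNT(*) > 1"""
).fetchall()

print(duplicate_groups)  # [(1, 3)]
```

On the real table, the GROUP BY list would presumably name every column that has to match for two rows to count as duplicates.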
Edit #2: Thank you, everyone, for your time! I’ve solved the problem by using WouterH’s answer, resulting in the following query:
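DELETE FROM translation USING translation, translation AS translationTemp
WHERE translation.Latest = 1
-- Match only against a copy with a lower TranslationID, so that the
-- lowest-ID row of each duplicate group is kept
AND (translation.TranslationID > translationTemp.TranslationID)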
AND (translation.LanguageID = translationTemp.LanguageID)
AND (translation.Translation = translationTemp.Translation)
AND (translation.Etymology = translationTemp.Etymology)
AND (translation.Type = translationTemp.Type)
AND (translation.Source = translationTemp.Source)
AND (translation.Comments = translationTemp.Comments)
AND (translation.WordID = translationTemp.WordID)
AND (translation.Latest = translationTemp.Latest)
AND (translation.AuthorID = translationTemp.AuthorID)
AND (translation.NamespaceID = translationTemp.NamespaceID)
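You can remove duplicates without a temporary table or subquery. Delete all rows that have the same data but a different TranslationID, taking care to keep one row from each group (for example, the one with the lowest TranslationID).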
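As a rough, runnable illustration of that idea, the sketch below keeps the lowest TranslationID in each duplicate group and deletes the rest. Several things here are assumptions rather than part of the original setup: it uses an in-memory SQLite database through Python instead of MySQL, the table is a hypothetical cut-down version of translation, and because SQLite has no multi-table DELETE the self-join is written as a correlated EXISTS subquery. It also assumes Etymology may be NULL, so that column is compared with SQLite's IS operator; in MySQL the NULL-safe operator is <=>, since a plain = never matches two NULLs.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the schema is a hypothetical,
# cut-down version of the `translation` table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE translation (
           TranslationID INTEGER PRIMARY KEY,
           LanguageID    INTEGER,
           Translation   TEXT,
           Etymology     TEXT,
           Latest        INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO translation VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "water", None, 1),
        (2, 1, "water", None, 1),      # duplicate of row 1
        (3, 1, "water", None, 1),      # triplicate of row 1
        (4, 1, "water", None, 0),      # same data but Latest = 0: left alone
        (5, 2, "fire", "example", 1),  # unique
    ],
)

# Delete a Latest = 1 row only if an identical Latest = 1 row with a
# LOWER TranslationID exists, so the lowest-ID copy always survives.
conn.execute(
    """DELETE FROM translation
       WHERE Latest = 1
         AND EXISTS (
             SELECT 1
             FROM translation AS other
             WHERE other.TranslationID < translation.TranslationID
               AND other.Latest = 1
               AND other.LanguageID = translation.LanguageID
               AND other.Translation = translation.Translation
               AND other.Etymology IS translation.Etymology  -- NULL-safe
         )"""
)
conn.commit()

remaining_ids = [
    row[0]
    for row in conn.execute(
        "SELECT TranslationID FROM translation ORDER BY TranslationID"
    )
]
print(remaining_ids)  # [1, 4, 5]
```

The strict inequality on TranslationID is what leaves one copy behind: comparing with "not equal" instead would match every member of a duplicate group against its siblings and remove all of them. Running the statement on a backup copy of the table first is a sensible precaution.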