The Archive Base

Editorial Team
Asked: June 14, 2026


I have a table `eng-jap` which is essentially just a translation table, with an English and a Japanese column. A script I made somehow caused every insert to be cloned, so there are now thousands of duplicate entries in this table. For example:

duplicate example A

eng                        jap
"mother washes every day"  "母は毎日洗濯する"
"mother washes every day"  "母は毎日洗濯する"

If it were just one column, I could use this query:

SELECT eng, COUNT(*) c FROM `eng-jap` GROUP BY eng HAVING c > 1

But the table can legitimately have duplicates in eng or in jap, as long as a row is not duplicated in both. For example:

duplicate example B

eng                        jap
"mother washes every day"  "母は毎日洗濯する"
"every day mother washes"  "母は毎日洗濯する"

This is to allow one sentence to have more than one translation. So I need to alter the query to find duplicates on the combination of both columns.

Once again, to be clear: example B is fine. I want to select all duplicates like example A so I can write a script that removes all but one row of each duplicate. Please and thank you!


1 Answer

    Editorial Team
    Added an answer on June 14, 2026 at 7:22 am

    I think you just need to group by eng and jap:

    SELECT eng, jap, COUNT(*) c FROM `eng-jap` GROUP BY eng, jap HAVING c > 1
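
To make the grouping concrete, here is a minimal sketch that runs the same kind of query against the two examples from the question. It uses Python's built-in sqlite3 module and an in-memory database as a stand-in for the MySQL table, which is an assumption made only so the example is runnable; the helper names `make_demo_table` and `find_duplicate_pairs` are hypothetical.

```python
import sqlite3


def make_demo_table():
    """Build an in-memory SQLite table standing in for the MySQL `eng-jap` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "eng-jap" (id INTEGER PRIMARY KEY, eng TEXT, jap TEXT)')
    rows = [
        ("mother washes every day", "母は毎日洗濯する"),  # example A, original row
        ("mother washes every day", "母は毎日洗濯する"),  # example A, cloned row
        ("every day mother washes", "母は毎日洗濯する"),  # example B, legitimate
    ]
    conn.executemany('INSERT INTO "eng-jap" (eng, jap) VALUES (?, ?)', rows)
    return conn


def find_duplicate_pairs(conn):
    """Return (eng, jap, count) for every pair that occurs more than once."""
    query = (
        'SELECT eng, jap, COUNT(*) AS c FROM "eng-jap" '
        "GROUP BY eng, jap HAVING COUNT(*) > 1"
    )
    return conn.execute(query).fetchall()


conn = make_demo_table()
for eng, jap, count in find_duplicate_pairs(conn):
    print(eng, jap, count)
```

Only the example A pair is reported. The example B row shares its jap value with the others but has a different eng value, so it forms its own group of size one.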
    

    If you want to remove all the duplicates and your rows have an id, this query shows the ids of the rows you have to keep:

    select
      SUBSTRING_INDEX(GROUP_CONCAT(CAST(id AS CHAR) order by id), ',', 1) as id
    from `eng-jap`
    group by eng, jap
    

    (it’s a trick that uses GROUP_CONCAT to find the first id of every combination of eng/jap). And this query shows the ids of the rows you have to delete:

    select `eng-jap`.id
    from
      `eng-jap`
         left join
      (select
         SUBSTRING_INDEX(GROUP_CONCAT(CAST(id AS CHAR) order by id), ',', 1) as id
         from `eng-jap`
         group by eng, jap) `eng-jap-dup`
      on `eng-jap`.id = `eng-jap-dup`.id
    where `eng-jap-dup`.id is null
    

    I rewrote this query using just a join, so it should be a little faster, but if your table is very big it will probably still be slow.
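
As a rough illustration of the "keep the first id, delete the rest" idea, the sketch below removes the clones from a small demo table. It is not the query above: it uses `MIN(id)` instead of the GROUP_CONCAT trick, and it runs on an in-memory SQLite database through Python's sqlite3 module rather than on MySQL. Both are assumptions made for the sake of a runnable example, and the helper names are hypothetical. On MySQL, a DELETE whose subquery reads from the same table usually has to wrap that subquery in a derived table.

```python
import sqlite3


def make_demo_table():
    """Build an in-memory SQLite table standing in for the MySQL `eng-jap` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "eng-jap" (id INTEGER PRIMARY KEY, eng TEXT, jap TEXT)')
    rows = [
        ("mother washes every day", "母は毎日洗濯する"),  # id 1, kept
        ("mother washes every day", "母は毎日洗濯する"),  # id 2, clone of id 1
        ("every day mother washes", "母は毎日洗濯する"),  # id 3, second translation
        ("mother washes every day", "母は毎日洗濯する"),  # id 4, clone of id 1
    ]
    conn.executemany('INSERT INTO "eng-jap" (eng, jap) VALUES (?, ?)', rows)
    return conn


def delete_duplicate_pairs(conn):
    """Delete every row whose id is not the smallest id of its (eng, jap) pair."""
    cursor = conn.execute(
        'DELETE FROM "eng-jap" WHERE id NOT IN '
        '(SELECT MIN(id) FROM "eng-jap" GROUP BY eng, jap)'
    )
    return cursor.rowcount  # number of rows removed


def remaining_ids(conn):
    """Return the ids still present in the table, in ascending order."""
    return [row[0] for row in conn.execute('SELECT id FROM "eng-jap" ORDER BY id')]


conn = make_demo_table()
print("deleted:", delete_duplicate_pairs(conn))
print("remaining ids:", remaining_ids(conn))
```

It is probably worth running the corresponding SELECT first and checking the ids it returns before deleting anything from the real table.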

    If it is still too slow, or it still doesn't work, I would suggest adding two more columns to your table:

    • eng-hash, where you can save MD5(eng)
    • jap-hash, where you can save MD5(jap)

    then update all of your records like this:

    update `eng-jap` set `eng-jap`.`eng-hash` = MD5(eng), `eng-jap`.`jap-hash` = MD5(jap)
    

    then you can add a unique index on both hash columns, ignore all errors, and let MySQL eliminate the duplicates for you (ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions):

    alter ignore table `eng-jap` add unique index (`eng-hash`, `jap-hash`);
    

    (if you get an error while creating index, see this question: MySQL: ALTER IGNORE TABLE gives "Integrity constraint violation")
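
The hash columns plus unique index idea can also be sketched without ALTER IGNORE. The example below is a variation, not the statement above: it copies the rows into a second table that already has the unique index and lets the database skip the conflicting inserts. It uses SQLite's `INSERT OR IGNORE` through Python's sqlite3 module and computes the MD5 values with hashlib, all of which are assumptions chosen to keep the example runnable. The table name `eng-jap-clean`, the index name, and the helper functions are hypothetical.

```python
import hashlib
import sqlite3


def md5_hex(text):
    """Hex MD5 digest of the UTF-8 encoded text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def make_demo_table():
    """Build an in-memory SQLite table standing in for the MySQL `eng-jap` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "eng-jap" (id INTEGER PRIMARY KEY, eng TEXT, jap TEXT)')
    rows = [
        ("mother washes every day", "母は毎日洗濯する"),  # id 1
        ("mother washes every day", "母は毎日洗濯する"),  # id 2, clone of id 1
        ("every day mother washes", "母は毎日洗濯する"),  # id 3, second translation
    ]
    conn.executemany('INSERT INTO "eng-jap" (eng, jap) VALUES (?, ?)', rows)
    return conn


def copy_without_duplicates(conn):
    """Copy rows into a table with a unique index on both hashes, skipping clones."""
    conn.execute(
        'CREATE TABLE "eng-jap-clean" ('
        "id INTEGER PRIMARY KEY, eng TEXT, jap TEXT, "
        '"eng-hash" TEXT, "jap-hash" TEXT)'
    )
    conn.execute(
        'CREATE UNIQUE INDEX "eng-jap-hash-idx" '
        'ON "eng-jap-clean" ("eng-hash", "jap-hash")'
    )
    rows = conn.execute('SELECT id, eng, jap FROM "eng-jap" ORDER BY id').fetchall()
    # OR IGNORE silently drops any row that would violate the unique index.
    conn.executemany(
        'INSERT OR IGNORE INTO "eng-jap-clean" VALUES (?, ?, ?, ?, ?)',
        [(i, eng, jap, md5_hex(eng), md5_hex(jap)) for i, eng, jap in rows],
    )
    return conn.execute('SELECT id, eng, jap FROM "eng-jap-clean" ORDER BY id').fetchall()


conn = make_demo_table()
for row in copy_without_duplicates(conn):
    print(row)
```

Because the rows are copied in id order, the first row of each (eng, jap) pair is the one that survives.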
