The Archive Base

Editorial Team
Asked: May 26, 2026

A previous DBA managed a non relational table with 2.4M entries, all with unique

A previous DBA managed a non-relational table with 2.4M entries, all with unique IDs. However, there are duplicate records, each holding different data. For example:

+---------+---------+--------------+---------+------------+-------------+
| id      | Name    | Address      | Phone   | Email      | LastVisited |
+---------+---------+--------------+---------+------------+-------------+
| 1       | bob     | 12 Some Road | 02456   |            |             | 
| 2       | bobby   |              | 02456   | bob@domain |             |
| 3       | bob     | 12 Some Rd   | 02456   |            | 2010-07-13  | 
| 4       | sir bob |              | 02456   |            |             |
| 5       | bob     | 12SomeRoad   | 02456   |            |             |
| 6       | mr bob  |              | 02456   |            |             |
| 7       | robert  |              | 02456   |            |             |
+---------+---------+--------------+---------+------------+-------------+

This isn't the exact table; the real table has 32 columns. This is just to illustrate.

I know how to identify the duplicates; in this case I'm using the phone number. I've extracted the duplicates into a separate table, which holds 730k entries in total.

What would be the most efficient way of merging these records (and flagging the un-needed records for deletion)?

I've looked at using UPDATE with INNER JOINs, but several WHERE clauses are needed, because I want to update the first record with data from subsequent records wherever a subsequent record has data that the first record lacks.

I've looked at third-party software such as Fuzzy Dups, but I'd like a pure MySQL option if possible.

The end goal is that I'd be left with something like:

+---------+---------+--------------+---------+------------+-------------+
| id      | Name    | Address      | Phone   | Email      | LastVisited |
+---------+---------+--------------+---------+------------+-------------+
| 1       | bob     | 12 Some Road | 02456   | bob@domain | 2010-07-13  | 
+---------+---------+--------------+---------+------------+-------------+

Should I be looking at looping in a stored procedure or function, or is there some really easy thing I've missed?

1 Answer

  1. Editorial Team
    Added an answer on May 26, 2026 at 9:20 am

    You have to create a PROCEDURE, but before that, create your own temp_table holding one row per phone number, like:

    INSERT INTO temp_table (column1, column2, ...) SELECT column1, column2, ... FROM myTable GROUP BY phoneNumber;
    

    Create the table above as a physical table so that you can run a cursor on it.
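
    As a rough sketch of that setup step, assuming the cursor only needs the surviving id and the phone number: the column types, the choice of MIN(id) as the row to keep, and the HAVING filter are assumptions, not something stated above.

    ```sql
    -- Hypothetical layout: one row per duplicated phone number.
    CREATE TABLE temp_table (
        id          INT         NOT NULL PRIMARY KEY,  -- id of the row to keep
        phoneNumber VARCHAR(32) NOT NULL,
        KEY idx_phone (phoneNumber)
    );

    -- Keep the lowest id in each group; skip phone numbers that occur only once.
    INSERT INTO temp_table (id, phoneNumber)
    SELECT MIN(id), phoneNumber
    FROM myTable
    GROUP BY phoneNumber
    HAVING COUNT(*) > 1;
    ```

    Selecting MIN(id) rather than a bare id keeps the query valid when ONLY_FULL_GROUP_BY is enabled.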

    Then create the procedure (called myPROC here). Inside it, declare a cursor on temp_table and, on each iteration, fetch the phoneNumber and id of the current row into local variables (L_id, L_phoneNum).
    

    Here you also need a second table, similar_tempTable, which holds every row that shares the current phone number:

    INSERT INTO similar_tempTable (column1, column2, ...) SELECT column1, column2, ... FROM myTable WHERE phoneNumber = L_phoneNum;
    

    The next step is to extract the value you want for each column from similar_tempTable, update the row of myTable where id = L_id with those values, and delete the remaining duplicate rows from myTable.

    One more thing: truncate similar_tempTable after every iteration of the cursor.
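
    Put together, the steps above might look roughly like the sketch below. It assumes temp_table has the columns id and phoneNumber, that similar_tempTable already exists with the columns Address, Email and LastVisited, that missing text values are empty strings or NULL, and that myTable has a hypothetical to_delete flag column (the question asks for flagging rather than deleting). The column list stands in for the real 32 columns.

    ```sql
    DELIMITER //

    CREATE PROCEDURE myPROC()
    BEGIN
        DECLARE done INT DEFAULT 0;
        DECLARE L_id INT;
        DECLARE L_phoneNum VARCHAR(32);
        DECLARE cur CURSOR FOR SELECT id, phoneNumber FROM temp_table;
        DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

        OPEN cur;
        merge_loop: LOOP
            FETCH cur INTO L_id, L_phoneNum;
            IF done THEN
                LEAVE merge_loop;
            END IF;

            -- Collect every row that shares the current phone number.
            INSERT INTO similar_tempTable (Address, Email, LastVisited)
            SELECT Address, Email, LastVisited
            FROM myTable
            WHERE phoneNumber = L_phoneNum;

            -- Fill only the gaps in the surviving row; existing values win.
            UPDATE myTable AS keep
            CROSS JOIN (
                SELECT MAX(NULLIF(Address, '')) AS Address,
                       MAX(NULLIF(Email, ''))   AS Email,
                       MAX(LastVisited)         AS LastVisited
                FROM similar_tempTable
            ) AS best
            SET keep.Address     = COALESCE(NULLIF(keep.Address, ''), best.Address),
                keep.Email       = COALESCE(NULLIF(keep.Email, ''), best.Email),
                keep.LastVisited = COALESCE(keep.LastVisited, best.LastVisited)
            WHERE keep.id = L_id;

            -- Flag the other rows instead of deleting them straight away.
            UPDATE myTable
            SET to_delete = 1
            WHERE phoneNumber = L_phoneNum AND id <> L_id;

            -- DELETE rather than TRUNCATE, which would commit implicitly inside the loop.
            DELETE FROM similar_tempTable;
        END LOOP;
        CLOSE cur;
    END //

    DELIMITER ;
    ```

    When several duplicates hold different non-empty values for the same column, MAX simply picks the largest one, so review that rule for any column where the choice matters.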

    Hope this helps.
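
    For comparison, the same merge can probably be written without a cursor, which may be faster on 2.4M rows. This is a different, set-based technique, not the cursor approach described above; it is only a sketch and assumes the same rules: keep the lowest id per phone number, treat empty strings as missing, and use a hypothetical to_delete flag column.

    ```sql
    -- Step 1: fill gaps in the surviving row of each duplicate group.
    UPDATE myTable AS keep
    JOIN (
        SELECT MIN(id)                  AS keep_id,
               MAX(NULLIF(Address, '')) AS Address,
               MAX(NULLIF(Email, ''))   AS Email,
               MAX(LastVisited)         AS LastVisited
        FROM myTable
        GROUP BY phoneNumber
        HAVING COUNT(*) > 1
    ) AS best ON best.keep_id = keep.id
    SET keep.Address     = COALESCE(NULLIF(keep.Address, ''), best.Address),
        keep.Email       = COALESCE(NULLIF(keep.Email, ''), best.Email),
        keep.LastVisited = COALESCE(keep.LastVisited, best.LastVisited);

    -- Step 2: flag every other row in the group for deletion.
    UPDATE myTable AS dup
    JOIN (
        SELECT phoneNumber, MIN(id) AS keep_id
        FROM myTable
        GROUP BY phoneNumber
        HAVING COUNT(*) > 1
    ) AS grp ON grp.phoneNumber = dup.phoneNumber
    SET dup.to_delete = 1
    WHERE dup.id <> grp.keep_id;
    ```

    An index on phoneNumber should help both statements, and it is worth trying them on a copy of the table first.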
