The Archive Base

Editorial Team
Asked: June 4, 2026 (2026-06-04T13:56:08+00:00)

I am working on a system where database records are periodically created based on


I am working on a system where database records are periodically created based on an input stream of data. Occasionally some input comes along that provides evidence that two independently created records should be merged into one. I am looking for recommendations on ways to effect the merge in the database.

The main table (which is merely a design at this point) contains records consisting of a unique ID (call it the main ID, which is assigned by the database, MySQL in my system), and some data fields. There are also some other tables that use the main ID to link their records to a record in the main table.

MainTable:
int   mainID
blob  data
...

OtherTable:
int   otherID
int   mainID
blob  otherData
...

If neither record's ID has ever been shared with an external process or system, it is straightforward to blend the data fields from one record into the other and delete the record that was merged away. It is also straightforward (if tedious and/or inefficient) to update the main ID fields in the other tables to the main ID value we are keeping.
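To make the simple case concrete, here is a minimal sketch of such a merge done in a single transaction. It is an illustration under assumptions, not a definitive implementation: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL, the function name merge_records is hypothetical, and concatenating the two data fields is only a placeholder for whatever blending rule the real system needs.

```python
import sqlite3

def merge_records(conn, keep_id, drop_id):
    """Fold record drop_id into keep_id, then delete drop_id (hypothetical helper)."""
    with conn:  # commits on success, rolls back if anything raises
        rows = dict(conn.execute(
            "SELECT mainID, data FROM MainTable WHERE mainID IN (?, ?)",
            (keep_id, drop_id)))
        if keep_id == drop_id or len(rows) != 2:
            raise ValueError("need two distinct, existing records")
        # Placeholder blend rule (an assumption): concatenate the two blobs.
        blended = rows[keep_id] + rows[drop_id]
        conn.execute("UPDATE MainTable SET data = ? WHERE mainID = ?",
                     (blended, keep_id))
        # Repoint dependent rows at the surviving record.
        conn.execute("UPDATE OtherTable SET mainID = ? WHERE mainID = ?",
                     (keep_id, drop_id))
        conn.execute("DELETE FROM MainTable WHERE mainID = ?", (drop_id,))

# In-memory SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MainTable (mainID INTEGER PRIMARY KEY, data BLOB);
    CREATE TABLE OtherTable (otherID INTEGER PRIMARY KEY,
                             mainID INTEGER, otherData BLOB);
""")
conn.executemany("INSERT INTO MainTable VALUES (?, ?)",
                 [(1, b"left"), (2, b"right")])
conn.executemany("INSERT INTO OtherTable VALUES (?, ?, ?)",
                 [(10, 1, b"x"), (11, 2, b"y")])

merge_records(conn, keep_id=1, drop_id=2)
```

Doing all three statements in one transaction means no reader ever sees a state where rows in OtherTable point at a main record that no longer exists.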

Things get complicated when the ID for each record has been shared outside the system. In this case, I think it would be unreasonable to have queries with those deleted IDs simply fail, though I could be convinced otherwise.

An idea I am considering is to introduce a merge table with two key fields: an original main ID and a current main ID. Its purpose is to alias one main ID to another. As each main table record is created, we add a record to the merge table mapping the new record's main ID to itself. When a merge occurs, we find the merge table record whose original main ID belongs to the main record being merged away and update its current main ID to the ID of the surviving record. Then, for every query based on a main ID, we map that ID through the merge table to find the effective main ID we should really use.

MergeTable:
int   mergeID
int   originalMainID
int   currentMainID
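
As a rough sketch of how this alias table could behave, the following uses Python's sqlite3 module with an in-memory database in place of MySQL. The helper names (create_record, alias_merge, fetch) are hypothetical, blending of the data fields is left out, and the merge step makes one assumption beyond the description above: it repoints every row whose current main ID is the record being merged away, so that aliases left by earlier merges keep resolving in a single hop.

```python
import sqlite3

SCHEMA = """
CREATE TABLE MainTable (mainID INTEGER PRIMARY KEY AUTOINCREMENT, data BLOB);
CREATE TABLE MergeTable (
    mergeID        INTEGER PRIMARY KEY,
    originalMainID INTEGER NOT NULL UNIQUE,
    currentMainID  INTEGER NOT NULL
);
"""

def create_record(conn, data):
    """Insert a main record plus its identity row in MergeTable."""
    with conn:
        main_id = conn.execute(
            "INSERT INTO MainTable (data) VALUES (?)", (data,)).lastrowid
        conn.execute(
            "INSERT INTO MergeTable (originalMainID, currentMainID) VALUES (?, ?)",
            (main_id, main_id))
    return main_id

def alias_merge(conn, keep_id, drop_id):
    """Alias drop_id (and anything already aliased to it) to keep_id."""
    with conn:
        conn.execute(
            "UPDATE MergeTable SET currentMainID = ? WHERE currentMainID = ?",
            (keep_id, drop_id))
        conn.execute("DELETE FROM MainTable WHERE mainID = ?", (drop_id,))

def fetch(conn, any_id):
    """Resolve any ID ever handed out to the record that now represents it."""
    return conn.execute(
        "SELECT m.mainID, m.data FROM MergeTable AS g "
        "JOIN MainTable AS m ON m.mainID = g.currentMainID "
        "WHERE g.originalMainID = ?", (any_id,)).fetchone()

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
a = create_record(conn, b"a")
b = create_record(conn, b"b")
c = create_record(conn, b"c")
alias_merge(conn, keep_id=b, drop_id=a)  # a now resolves to b
alias_merge(conn, keep_id=c, drop_id=b)  # a and b now both resolve to c
```

With this layout every lookup pays for one join, including lookups of records that were never merged, which is the cost to weigh against the simplicity of having a single query shape.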

Is this a good technique? Can the mapping be done seamlessly in SQL queries? Are there standard or better techniques I should be considering instead?

In doing research on this topic I found surprisingly few examples of this. This question is close, but the merge scenario is different from mine, or so it seems to me. I know a bit about databases, but am by no means an expert, so I probably don’t know the right terms to search for.


1 Answer

  1. Editorial Team
    Added an answer on June 4, 2026 at 1:56 pm (2026-06-04T13:56:10+00:00)

    I like your design idea, but consider one where you store only replaced records in your merge table, not all of them. This reduces storage and improves speed, given the following query:

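    -- Common case: the ID still exists in MainTable.
    SELECT *
      FROM MainTable
      WHERE mainID = 1
    UNION ALL
    -- Fallback: the ID was merged away, so follow MergeTable to its replacement.
    SELECT MainTable.*
      FROM MergeTable
      INNER JOIN MainTable
        ON MainTable.mainID = MergeTable.currentMainID
      WHERE MergeTable.originalMainID = 1
    -- The trailing LIMIT applies to the whole UNION, not just the second SELECT.
    LIMIT 1;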

    The idea is that in most cases, the first query will succeed and return a result, and MySQL will abort the second query since the LIMIT is fulfilled. If the first query returns no results, then it will proceed to the second query and perform the join on the merge table to see if it’s been merged.

    According to MySQL, regarding LIMIT:

    As soon as MySQL has sent the required number of rows to the client,
    it aborts the query unless you are using SQL_CALC_FOUND_ROWS.

    If merged records are the exception, not the rule, then this will save many, many joins.
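
For anyone who wants to try this end to end, here is a small sketch that runs the same lookup from Python, using the built-in sqlite3 module and an in-memory database in place of MySQL. The helpers record_merge and lookup are hypothetical names, and blending of the data fields is omitted. One detail goes slightly beyond the query above and is an assumption on my part: when a record that already absorbed others is itself merged away, its existing MergeTable rows are repointed so every old ID still resolves in one step.

```python
import sqlite3

# The lookup from the answer, with :id standing in for the literal 1.
LOOKUP = """
SELECT mainID, data
  FROM MainTable
  WHERE mainID = :id
UNION ALL
SELECT MainTable.mainID, MainTable.data
  FROM MergeTable
  INNER JOIN MainTable
    ON MainTable.mainID = MergeTable.currentMainID
  WHERE MergeTable.originalMainID = :id
LIMIT 1
"""

def record_merge(conn, keep_id, drop_id):
    """Delete drop_id and remember that it now lives on as keep_id."""
    with conn:  # one transaction for all three statements
        # Repoint aliases from earlier merges into drop_id.
        conn.execute(
            "UPDATE MergeTable SET currentMainID = ? WHERE currentMainID = ?",
            (keep_id, drop_id))
        # Only replaced records get a row; live records have none.
        conn.execute(
            "INSERT INTO MergeTable (originalMainID, currentMainID) VALUES (?, ?)",
            (drop_id, keep_id))
        conn.execute("DELETE FROM MainTable WHERE mainID = ?", (drop_id,))

def lookup(conn, any_id):
    return conn.execute(LOOKUP, {"id": any_id}).fetchone()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MainTable (mainID INTEGER PRIMARY KEY, data BLOB);
    CREATE TABLE MergeTable (
        mergeID        INTEGER PRIMARY KEY,
        originalMainID INTEGER NOT NULL UNIQUE,
        currentMainID  INTEGER NOT NULL
    );
""")
conn.executemany("INSERT INTO MainTable VALUES (?, ?)",
                 [(1, b"a"), (2, b"b"), (3, b"c")])
record_merge(conn, keep_id=2, drop_id=1)
record_merge(conn, keep_id=3, drop_id=2)
```

After the two merges, lookup(conn, 1), lookup(conn, 2) and lookup(conn, 3) all return the surviving record 3, while an ID that was never issued returns None. This relies on main IDs never being reused once a record is deleted.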

    You could also do this with two queries if the UNION query is too scary. You could simply check to see if the record exists, and if not, then check the merge table.
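
The two-query version might look roughly like the sketch below. Again this is Python with sqlite3 and an in-memory database standing in for MySQL; find_record is a hypothetical name, and the sample data assumes record 1 was already merged into record 2.

```python
import sqlite3

def find_record(conn, any_id):
    """Try MainTable first; consult MergeTable only on a miss."""
    row = conn.execute(
        "SELECT mainID, data FROM MainTable WHERE mainID = ?",
        (any_id,)).fetchone()
    if row is not None:
        return row  # common case: the record was never merged away
    # Miss: the ID may have been merged away, so follow its alias.
    return conn.execute(
        "SELECT MainTable.mainID, MainTable.data "
        "FROM MergeTable "
        "INNER JOIN MainTable ON MainTable.mainID = MergeTable.currentMainID "
        "WHERE MergeTable.originalMainID = ?",
        (any_id,)).fetchone()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MainTable (mainID INTEGER PRIMARY KEY, data BLOB);
    CREATE TABLE MergeTable (
        mergeID        INTEGER PRIMARY KEY,
        originalMainID INTEGER NOT NULL UNIQUE,
        currentMainID  INTEGER NOT NULL
    );
    -- Sample state: record 1 has already been merged into record 2.
    INSERT INTO MainTable VALUES (2, x'62');
    INSERT INTO MergeTable (originalMainID, currentMainID) VALUES (1, 2);
""")

live = find_record(conn, 2)     # found by the first query
merged = find_record(conn, 1)   # found through MergeTable
missing = find_record(conn, 7)  # never existed, so None
```

The trade-off is an extra round trip to the database for merged or unknown IDs, in exchange for two simple queries that are easy to read and index.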
