The Archive Base

Editorial Team
Asked: May 25, 2026 (2026-05-25T20:47:20+00:00)


The problem

We have a table of duplicate customer numbers:

A varchar(16) NOT NULL,
B varchar(16) NOT NULL

These columns started off as Old and New (Delete and Retain), but devolved to where neither is preferred. The columns really are just “A” and “B” — two numbers for the same customer, in any order.

Furthermore, the table can have an arbitrary number of pairs for the same customer. You might see rows like

a,b
b,c

meaning a,b,c are all for the same customer. You might also see rows like

a,b
b,a
c,a

meaning a,b,c are all the same customer.

It’s not a clean acyclic representation like “old” and “new” values. A customer’s list of IDs is scattered across one or more rows, and the only link between rows is that a value in the A or B column of one row may also appear in the A or B column of another. My mission is to tie them all together into one list per customer.

I want to convert this mess to something like

MasterKey int NOT NULL,
CustNum varchar(16) NOT NULL UNIQUE,
PRIMARY KEY( MasterKey, CustNum )

The one or more numbers for a customer would share the MasterKey in this table. As the UNIQUE constraint says, a given CustNum can’t appear more than once.

So for example, rows like this from the original

1a,1b
1b,1c
2a,2b
2b,2c
2d,2a
...

should end up as rows like this in the new table

1 1a
1 1b
1 1c
2 2a
2 2b
2 2c
2 2d
...

Edit: The values above are just to make the pattern clear. The actual customer number values are arbitrary varchars.
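
To make the example reproducible, the sample rows above can be loaded with a short script. This is only a sketch: the table name DuplicatePairs is an assumption (the legacy table is never named here), and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- DuplicatePairs is a hypothetical name for the legacy pairs table.
CREATE TABLE DuplicatePairs
(
    A varchar(16) NOT NULL,
    B varchar(16) NOT NULL
);

-- The sample pairs from the example above.
INSERT INTO DuplicatePairs (A, B) VALUES
    ('1a', '1b'),
    ('1b', '1c'),
    ('2a', '2b'),
    ('2b', '2c'),
    ('2d', '2a');

-- Every distinct customer number, regardless of which column it appears in.
SELECT A AS CustNum FROM DuplicatePairs
UNION
SELECT B FROM DuplicatePairs;
```

The final query lists each customer number once, which is the set of CustNum values the new table has to cover.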

My attempted solutions

This feels like a job for recursion and therefore a CTE. But the potentially cyclic nature of the source data makes it hard for me to get the anchor case. I’ve tried to pre-clean it into more of an acyclic form, but I still can’t seem to get this right.

I’m also stubbornly trying to do this as a set-based SQL operation, instead of resorting to a cursor and loop. But maybe that’s not possible.

I’ve spent a good 8 hours pondering this and trying different approaches but it keeps slipping away. Any ideas or suggestions on the correct approach, or even some example code?

1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 8:47 pm (2026-05-25T20:47:21+00:00)

    I’m going to do something I haven’t done before, and post an answer to
    my own question. I need to give huge thanks to both Beth and JBrooks
    for moving me in the right direction. I really wanted to solve this
    in a set-based, declarative way. And maybe that’s still possible using
    a CTE and recursion. But once I surrendered and said it’s OK for it to
    be imperative and iterative, it was much easier to do it.
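
    For reference, one mostly set-based route that avoids the anchor problem is label propagation: give every customer number its own label, then repeatedly replace each label with the smallest label among its neighbours until nothing changes. The sketch below is not the approach used in this answer. It still needs a WHILE loop, although each pass is a single set-based UPDATE. The table name DuplicatePairs and the temp table #Labels are assumptions for illustration, and the inline DECLARE initialisation assumes SQL Server 2008 or later.

    ```sql
    -- Assumes a hypothetical DuplicatePairs(A, B) table holding the legacy rows.
    -- Start with every customer number labelled by itself.
    SELECT CustNum, CustNum AS Label
    INTO #Labels
    FROM (SELECT A AS CustNum FROM DuplicatePairs
          UNION
          SELECT B FROM DuplicatePairs) AS n;

    DECLARE @changed int = 1;

    WHILE @changed > 0
    BEGIN
        -- Pull the smallest neighbouring label across each pair, in both directions.
        UPDATE l
        SET Label = m.MinLabel
        FROM #Labels AS l
        JOIN (
            SELECT e.CustNum, MIN(nl.Label) AS MinLabel
            FROM (SELECT A AS CustNum, B AS Neighbor FROM DuplicatePairs
                  UNION ALL
                  SELECT B, A FROM DuplicatePairs) AS e
            JOIN #Labels AS nl ON nl.CustNum = e.Neighbor
            GROUP BY e.CustNum
        ) AS m ON m.CustNum = l.CustNum
        WHERE m.MinLabel < l.Label;

        -- Stop once a full pass changes no rows.
        SET @changed = @@ROWCOUNT;
    END;

    -- Number the groups to get one MasterKey per customer.
    SELECT DENSE_RANK() OVER (ORDER BY Label) AS MasterKey, CustNum
    FROM #Labels
    ORDER BY MasterKey, CustNum;
    ```

    Labels only ever decrease, so the loop terminates, and cycles in the source data are harmless. The number of passes grows with the length of the longest chain, so very long chains could make this slow.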

    Anyway, given this target table from my question (with MasterKey renamed to uid and CustNum renamed to gpid):

    CREATE TABLE UniqueCustomers
    (
        uid     int NOT NULL,
        gpid    varchar(16) NOT NULL UNIQUE, -- Important: UNIQUE to disallow duplicates
        PRIMARY KEY( uid, gpid ) -- Important: Disallow duplicates
    )
    

    I came up with the following stored procedure. It can be called when
    new dupes are reported, one by one. It can also be called in a loop
    over the legacy table that stores the dupes as pairs in a random
    order.

    CREATE PROCEDURE ReportDuplicateCustomerIDs
    (
        @id1 varchar(16),
        @id2 varchar(16)
    )
    AS
    BEGIN
        IF @id1 <> @id2
        BEGIN
            -- Retrieve the uid (if any) for each of the ids
            DECLARE @uid1 int
            SELECT @uid1 = NULL
            SELECT @uid1 = uid FROM UniqueCustomers WHERE gpid = @id1
    
            DECLARE @uid2 int
            SELECT @uid2 = NULL
            SELECT @uid2 = uid FROM UniqueCustomers WHERE gpid = @id2
    
            -- If we've seen NEITHER of the id's yet
            IF @uid1 IS NULL AND @uid2 IS NULL
            BEGIN
                -- Add both of them using a brand-new uid
                DECLARE @uidNew int
                SELECT @uidNew = Max(uid) + 1 FROM UniqueCustomers
                IF @uidNew IS NULL
                    SET @uidNew = 0
                INSERT INTO UniqueCustomers VALUES( @uidNew, @id1 )
                INSERT INTO UniqueCustomers VALUES( @uidNew, @id2 )
            END
            ELSE
            BEGIN
                -- If we've seen BOTH id's already
                IF @uid1 IS NOT NULL AND @uid2 IS NOT NULL
                BEGIN
                    -- If this pair bridges two existing chains.
                    IF @uid1 <> @uid2
                    BEGIN
                        -- Update everything using uid2 to use uid1 instead.
                        -- Consolidates the two dupe chains into one.
                        UPDATE UniqueCustomers SET uid = @uid1 WHERE uid = @uid2
                    END
                    -- ELSE nothing to do
                END
                ELSE
                BEGIN
                    -- If we've seen only id1, then insert id2 using
                    -- the same uid that id1 is already using
                    IF @uid1 IS NOT NULL
                        INSERT INTO UniqueCustomers VALUES( @uid1, @id2 )
                    -- If we've seen only id2, then insert id1 using
                    -- the same uid that id2 is already using
                    ELSE -- @uid2 IS NOT NULL
                        INSERT INTO UniqueCustomers VALUES( @uid2, @id1 )
                END
            END
        END
    END
    GO
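
    The loop over the legacy table mentioned above could look roughly like the following. This is a sketch under assumptions: DuplicatePairs(A, B) is a hypothetical name for the legacy table, and the cursor name is arbitrary.

    ```sql
    -- DuplicatePairs(A, B) is a hypothetical name for the legacy pairs table.
    DECLARE @a varchar(16), @b varchar(16);

    DECLARE pair_cursor CURSOR LOCAL FAST_FORWARD FOR
        SELECT A, B FROM DuplicatePairs;

    OPEN pair_cursor;
    FETCH NEXT FROM pair_cursor INTO @a, @b;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Each call starts a new chain, extends one, merges two, or does nothing.
        EXEC ReportDuplicateCustomerIDs @id1 = @a, @id2 = @b;
        FETCH NEXT FROM pair_cursor INTO @a, @b;
    END;

    CLOSE pair_cursor;
    DEALLOCATE pair_cursor;

    -- Inspect the consolidated chains.
    SELECT uid, gpid FROM UniqueCustomers ORDER BY uid, gpid;
    ```

    The order in which pairs are processed does not change which numbers end up grouped together, only which uid values they receive. The uid values start at 0, and merging two chains can leave gaps in the numbering.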
    