I’m using SQL Server 2008. I have a table Customers customer_number int field1 varchar

Question

Asked: May 15, 2026 (2026-05-15T05:36:43+00:00)

I’m using SQL Server 2008. I have a table Customers with these columns:

customer_number int
field1 varchar
field2 varchar
field3 varchar
field4 varchar

… and a lot more columns that don’t matter for my queries.

The column customer_number is the primary key. I’m trying to find duplicate values and the differences between them.

Please help me find all rows where:

1) all four columns (field1, field2, field3, field4) are equal

2) exactly 3 columns are equal and one is not (excluding rows from list 1)

3) exactly 2 columns are equal and two are not (excluding rows from list 1 and list 2)

In the end, I’ll have 3 tables with these results and an additional groupId, which will be the same for each group of similar rows. (For example, in the 3-column case, rows that share the same 3 column values form a separate group.)

Thank you.

1 Answer

Editorial Team · Answer 1 · 2026-05-15T05:36:43+00:00

The easiest approach would probably be to write a stored procedure that iterates over each group of customers with duplicates and inserts the matching rows under that group’s number.

However, I’ve thought about it and you can probably do this with a subquery. Hopefully I haven’t made it more complicated than it ought to be, but this should get you what you’re looking for in the first table of duplicates (all four fields). Note that this is untested, so it might need a little tweaking.

Basically, it finds each combination of field values that occurs more than once, assigns a group number to each combination, then joins back to Customers to get every customer with those values and gives them that group number.

INSERT INTO FourFieldsDuplicates(group_no, customer_no)
SELECT Groups.group_no, custs.customer_number
FROM (-- One row, and one group number, per combination that occurs more than once
      SELECT ROW_NUMBER() OVER(ORDER BY c.field1, c.field2, c.field3, c.field4) AS group_no,
             c.field1, c.field2, c.field3, c.field4
      FROM Customers c
      GROUP BY c.field1, c.field2, c.field3, c.field4
      HAVING COUNT(*) > 1) Groups
-- Note: "=" never matches NULL, so rows with a NULL in any of the four fields are skipped
INNER JOIN Customers custs ON custs.field1 = Groups.field1
                           AND custs.field2 = Groups.field2
                           AND custs.field3 = Groups.field3
                           AND custs.field4 = Groups.field4
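
The query above inserts into a result table that the question does not define. As a rough sketch, assuming a two-column table shaped the way the INSERT implies (the column types are a guess), the same result could also be produced without the self-join by using window functions, which SQL Server 2008 supports. Like the query above, this is untested:

```sql
-- Assumed shape of the result table; neither the question nor the answer defines it.
CREATE TABLE FourFieldsDuplicates (
    group_no    int NOT NULL,
    customer_no int NOT NULL   -- holds Customers.customer_number
);

-- Count how many rows share each combination of the four fields,
-- keep only the combinations that occur more than once,
-- then number the surviving combinations 1, 2, 3, ...
WITH Counted AS (
    SELECT c.customer_number, c.field1, c.field2, c.field3, c.field4,
           COUNT(*) OVER (PARTITION BY c.field1, c.field2,
                                       c.field3, c.field4) AS group_size
    FROM Customers c
)
INSERT INTO FourFieldsDuplicates (group_no, customer_no)
SELECT DENSE_RANK() OVER (ORDER BY field1, field2, field3, field4) AS group_no,
       customer_number
FROM Counted
WHERE group_size > 1;
```

One difference to be aware of: PARTITION BY puts NULLs into the same partition, so this variant treats two rows with NULL in the same field as equal, whereas the join on "=" does not. Which behavior is wanted depends on what a NULL means in your data.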

The other ones are a bit more complicated, however, as you’ll need to expand out the possibilities: one case for each column that is allowed to differ, with NULL standing in for that column. The three-field groups would then be:

INSERT INTO ThreeFieldsDuplicates(group_no, customer_no)
SELECT Groups.group_no, custs.customer_number
FROM (SELECT ROW_NUMBER() OVER(ORDER BY GroupsInner.field1, GroupsInner.field2,
                                        GroupsInner.field3, GroupsInner.field4) AS group_no,
             GroupsInner.field1, GroupsInner.field2, 
             GroupsInner.field3, GroupsInner.field4
      FROM (-- Each branch ignores one column (NULL placeholder) and keeps only the
            -- combinations of the other three that occur more than once.
            SELECT c.field1, c.field2, c.field3, NULL AS field4
            FROM Customers c
            WHERE NOT EXISTS(SELECT d.customer_no
                             FROM FourFieldsDuplicates d
                             WHERE d.customer_no = c.customer_number)
            GROUP BY c.field1, c.field2, c.field3
            HAVING COUNT(*) > 1
            UNION ALL
            SELECT c.field1, c.field2, NULL AS field3, c.field4
            FROM Customers c
            WHERE NOT EXISTS(SELECT d.customer_no
                             FROM FourFieldsDuplicates d
                             WHERE d.customer_no = c.customer_number)
            GROUP BY c.field1, c.field2, c.field4
            HAVING COUNT(*) > 1
            UNION ALL
            SELECT c.field1, NULL AS field2, c.field3, c.field4
            FROM Customers c
            WHERE NOT EXISTS(SELECT d.customer_no
                             FROM FourFieldsDuplicates d
                             WHERE d.customer_no = c.customer_number)
            GROUP BY c.field1, c.field3, c.field4
            HAVING COUNT(*) > 1
            UNION ALL
            SELECT NULL AS field1, c.field2, c.field3, c.field4
            FROM Customers c
            WHERE NOT EXISTS(SELECT d.customer_no
                             FROM FourFieldsDuplicates d
                             WHERE d.customer_no = c.customer_number)
            GROUP BY c.field2, c.field3, c.field4
            HAVING COUNT(*) > 1) GroupsInner) Groups
-- A NULL in Groups means "this column may differ", so it matches any value
INNER JOIN Customers custs ON (Groups.field1 IS NULL OR custs.field1 = Groups.field1)
                           AND (Groups.field2 IS NULL OR custs.field2 = Groups.field2)
                           AND (Groups.field3 IS NULL OR custs.field3 = Groups.field3)
                           AND (Groups.field4 IS NULL OR custs.field4 = Groups.field4)
-- Rows already reported as four-field duplicates must not reappear here
WHERE NOT EXISTS(SELECT d.customer_no
                 FROM FourFieldsDuplicates d
                 WHERE d.customer_no = custs.customer_number)

Hopefully this produces the right results and I’ll leave the last one as an exercise. 😀
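
For the two-field case that is left as an exercise, one possible sketch follows the same pattern, with six branches (one per pair of columns that must match) instead of four. The table name TwoFieldsDuplicates and its columns are hypothetical, chosen to mirror the other two result tables, and the query assumes those two tables have already been filled. It is untested, like the queries above:

```sql
-- Customers not already reported as four-field or three-field duplicates
WITH Remaining AS (
    SELECT c.customer_number, c.field1, c.field2, c.field3, c.field4
    FROM Customers c
    WHERE NOT EXISTS (SELECT 1 FROM FourFieldsDuplicates d
                      WHERE d.customer_no = c.customer_number)
      AND NOT EXISTS (SELECT 1 FROM ThreeFieldsDuplicates t
                      WHERE t.customer_no = c.customer_number)
),
-- One row per pair of field values that occurs more than once;
-- NULL marks the two columns that are allowed to differ
Keys AS (
    SELECT field1, field2, NULL AS field3, NULL AS field4
    FROM Remaining GROUP BY field1, field2 HAVING COUNT(*) > 1
    UNION ALL
    SELECT field1, NULL, field3, NULL
    FROM Remaining GROUP BY field1, field3 HAVING COUNT(*) > 1
    UNION ALL
    SELECT field1, NULL, NULL, field4
    FROM Remaining GROUP BY field1, field4 HAVING COUNT(*) > 1
    UNION ALL
    SELECT NULL, field2, field3, NULL
    FROM Remaining GROUP BY field2, field3 HAVING COUNT(*) > 1
    UNION ALL
    SELECT NULL, field2, NULL, field4
    FROM Remaining GROUP BY field2, field4 HAVING COUNT(*) > 1
    UNION ALL
    SELECT NULL, NULL, field3, field4
    FROM Remaining GROUP BY field3, field4 HAVING COUNT(*) > 1
),
-- Give every surviving pair its own group number
Groups AS (
    SELECT ROW_NUMBER() OVER (ORDER BY field1, field2, field3, field4) AS group_no,
           field1, field2, field3, field4
    FROM Keys
)
INSERT INTO TwoFieldsDuplicates (group_no, customer_no)   -- hypothetical table
SELECT g.group_no, r.customer_number
FROM Groups g
INNER JOIN Remaining r
        ON (g.field1 IS NULL OR r.field1 = g.field1)
       AND (g.field2 IS NULL OR r.field2 = g.field2)
       AND (g.field3 IS NULL OR r.field3 = g.field3)
       AND (g.field4 IS NULL OR r.field4 = g.field4);
```

Two caveats apply to this sketch and to the three-field query alike. A customer can land in more than one group, for example by sharing field1 and field2 with one row and field3 and field4 with another. And because NULL doubles as the "ignore this column" marker, real NULL values in the data would be treated as wildcards; if the fields are nullable, a separate flag per ignored column would probably be safer.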
