Okay, I have a table with some junk data and no unique identifier column.

Asked: June 17, 2026 (2026-06-17T14:22:10+00:00)

Okay, I have a table with some junk data and no unique identifier column. Let me give you an example of the table I’m working with:

     A    |   B   |  C   |        D         |   E  |
  --------------------------------------------------
1.  Fiona | Smith | NULL | 2152 Cherry Lane | CA   |
2.  Fiona | Smith | NULL | NULL             | NULL |
3.  Bill  | NULL  | ACME | 2903 Center Road | WA   |
4.  Bill  | NULL  | ACME | NULL             | NULL |
5.  NULL  | NULL  | ABC  | 2300 Water St    | PA   |
6.  NULL  | NULL  | ABC  | 2300 Water St    | PA   |
7.  NULL  | NULL  | NULL | 3455 B Street    | CO   |

I need to write a SELECT statement that returns only distinct rows. For example, take rows 1 and 2. They obviously refer to the same person, but they are only partial duplicates. Of those two, I want row 1 included in my result because it has the most data across its columns. The same goes for rows 3 and 4: row 3 is the one I want. For rows 5 and 6, it does not matter which one is selected since they are exact duplicates. Row 7 would be included by default since it is distinct (distinctness is judged on A, B and C together, not just A and B).

Here’s what I have tried:

SELECT A, B, C = MAX(D), MAX(E), 
FROM dbo.Data
GROUP BY A, B, C;

This seems to grab the unique rows I want, but the data is somehow placed into the wrong columns.

1 Answer

Editorial Team · Answer 1 · 2026-06-17T14:22:11+00:00

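This approach treats D and E as equally important: within each (A, B, C) group, rows are ranked by how many of D and E are populated, and only the top-ranked row is kept. Ties, such as exact duplicates, are broken arbitrarily. The sample data adds two 'Bob Barker' rows to the table from the question: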

DECLARE @x TABLE
(
  A VARCHAR(32), 
  B VARCHAR(32), 
  C VARCHAR(32), 
  D VARCHAR(32), 
  E VARCHAR(32)
);

INSERT @x VALUES
('Fiona', 'Smith', NULL,   '2152 Cherry Lane',  'CA'),
('Fiona', 'Smith', NULL,   NULL,                NULL),
('Bill',  NULL,    'ACME', '2903 Center Road',  'WA'),
('Bill',  NULL,    'ACME', NULL,                NULL),
(NULL  ,  NULL,    'ABC',  '2300 Water St',     'PA'),
(NULL  ,  NULL,    'ABC',  '2300 Water St',     'PA'),
(NULL  ,  NULL,    NULL,   '3455 B Street',     'CO'),
('Bob',   'Barker',NULL,   NULL,                NULL),
('Bob',   'Barker',NULL,   NULL,                'NY');

;WITH x AS
(
  SELECT A,B,C,D,E, rn = ROW_NUMBER() OVER 
  (
    PARTITION BY A,B,C
    ORDER BY COALESCE(LEN(LEFT(D,1)),0) + COALESCE(LEN(LEFT(E,1)),0) DESC
  )
  FROM @x
)
SELECT A,B,C,D,E
FROM x WHERE rn = 1;
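
For comparison, the GROUP BY attempt from the question can likely be repaired too. In T-SQL, `C = MAX(D)` is column-alias syntax (alias = expression), so the third output column is labelled C but holds MAX(D), which would explain the values landing in the wrong columns. The sketch below adds the missing comma and names each aggregate. The table variable @demo and its 'Bob Barker' rows are hypothetical, chosen to show one caveat: MAX(D) and MAX(E) are computed independently, so a result row can combine values from different source rows.

```sql
-- Hypothetical sample data: for Bob, D and E are populated on different rows.
DECLARE @demo TABLE
(
  A VARCHAR(32),
  B VARCHAR(32),
  C VARCHAR(32),
  D VARCHAR(32),
  E VARCHAR(32)
);

INSERT @demo VALUES
('Fiona', 'Smith',  NULL, '2152 Cherry Lane', 'CA'),
('Fiona', 'Smith',  NULL, NULL,               NULL),
('Bob',   'Barker', NULL, '1 Main St',        NULL),
('Bob',   'Barker', NULL, NULL,               'NY');

-- Corrected form of the query from the question: C is grouped as-is,
-- and the two aggregates are explicitly named D and E.
SELECT A, B, C,
       D = MAX(D),  -- MAX skips NULLs, so any non-NULL D in the group wins
       E = MAX(E)   -- evaluated independently of D
FROM @demo
GROUP BY A, B, C;
```

With this data, the Fiona group returns the fully populated row, but the Bob group returns ('Bob', 'Barker', NULL, '1 Main St', 'NY'), a combination that exists in no single source row. If each output row must match an actual row, the ROW_NUMBER() approach above is the safer choice; if merging partial rows is acceptable, GROUP BY is simpler.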
