First I would like to remark that I (as a Newb) did search through

Question

0

Asked: June 9, 20262026-06-09T15:26:11+00:00 2026-06-09T15:26:11+00:00

First I would like to remark that I (as a Newb) did search through

0

First I would like to remark that I (as a Newb) did search through several Q & A regarding duplicates in a table though unfortunately for me, I couldn’t manipulate the code being used as answer.

My table is made out of a report being sorted in SQL Server 2008.

I would like to know how do I remove duplicate records and with an explanation.

"MyTable":

Column1   (PK-auto incremental table's record ID) 
Column2   (some TXT) 
Column3   (Some TXT)
Column4   (SmallDateTime)
Column5   is empty

Column5 will hold the value of SUM(count of deleted duplicates including this survived row)

The key to the solution in may case is if [column2 and column3] have multiple records with same content (hence Duplicates) only they don’t always share the same date (column4).

From this:

col1  col2   col3  col4         col5
----  -----  ----  -----------  ----
1     [abc]  [4]   [10/1/2012]  null
2     [abc]  [1]   [12/1/2012]  null
3     [ghi]  [6]   [4/1/2012]   null
4     [def]  [5]   [8/1/2012]   null
5     [abc]  [4]   [10/1/2012]  null
6     [def]  [5]   [12/1/2012]  null
7     [ghi]  [6]   [15/1/2012]  null
8     [abc]  [4]   [17/1/2012]  null
9     [ghi]  [6]   [6/1/2012]   null
10    [abc]  [1]   [13/1/2012]  null

Into this:

col1  col2   col3  col4         col5
----  -----  ----  -----------  ----
8     [abc]  [4]   [17/1/2012]  2
10    [abc]  [1]   [13/1/2012]  3
6     [def]  [5]   [12/1/2012]  2
7     [ghi]  [6]   [15/1/2012]  3

Meaning leave the latest (1) as a representation of every duplicated record.

++ReEditing++

Aaron Bertrand
shawnt00
e2nburner… and rest of u all
i can’t say how much i thank your Reply although i did not yet comprehend that mass of code.
i am now going to check those codes but not b4 thanking you guys !!

when i first started to program and needed sql querys, after using

Select * From MyTable

… my 1’st SQL Statement …

i said HEY i know SQL !!! …. Now … look at that deep knowledge of you guys … THANKS A LOT i know that this post in StackOverFlow will be further useful for other beginners too

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-09T15:26:13+00:00

This answer uses a common table expression to apply row_number() and count() to each “slice” of data (meaning grouped by col2 + col3). The count() is used to identify how many rows belong to each such group, and the row_number() is used to apply a “rank” ordered by col4 desc (1 = latest per group, 2 = 2nd latest, etc). This also uses col1 (which looks like a unique column) to break any ties. The CTE can be followed by a query such as a select, update, delete, etc. So you can run the first select to validate that these are the rows you want to keep, and that the counts are correct. If they are, then you can proceed with the updates and deletes. You’ll notice that in all cases the row_number() output is used to identify the rows you keep or the rows you discard.

To identify the rows you want to keep:

;WITH n AS 
(
  SELECT col1, col2, col3, col4, 
    c = COUNT(*) OVER (PARTITION BY col2, col3),
    rn = ROW_NUMBER() OVER 
    (
      PARTITION BY col2, col3 ORDER BY col4 DESC, col1 DESC
    )
  FROM dbo.table_name
)
SELECT col1, col2, col3, col4, c
  FROM n WHERE rn = 1;

Once you’ve confirmed that those are the row you want to keep, you can update them like this:

;WITH n AS 
(
  SELECT col1, col2, col3, col4, col5, 
    c = COUNT(*) OVER (PARTITION BY col2, col3),
    rn = ROW_NUMBER() OVER 
    (
      PARTITION BY col2, col3 ORDER BY col4 DESC, col1 DESC
    )
  FROM dbo.table_name
)
UPDATE n SET col5 = c
  WHERE rn = 1;

Then delete the remainders this way:

;WITH n AS 
(
  SELECT col1, col2, col3, col4, 
    rn = ROW_NUMBER() OVER 
    (
      PARTITION BY col2, col3 ORDER BY col4 DESC, col1 DESC
    )
  FROM dbo.table_name
)
DELETE n WHERE rn > 1;

Or even more simply (assuming col5 was completely null before the update):

DELETE dbo.table_name WHERE col5 IS NULL;

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

First I would like to remark that I (as a Newb) did search through

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply