Editorial Team
Asked: June 17, 2026

Possible Duplicate:
How can I find duplicate entries and delete the oldest ones in SQL?

I have a database with a few thousand duplicate rows caused by a faulty update tool. I can identify the groups of rows that contain duplicates, but I need to delete only the oldest entries in each group, which are not necessarily the ones with the lowest id. The test data looks like this; rows that should be kept are marked with an *.

Rows count as duplicates only when both the title and the ruleid match, so rows with the same title but different ruleids (such as 3 and 4) are not duplicates. Within each group of duplicates, every row except the most recently created one should be deleted. (The actual id column is a GUID, so I cannot assume auto-increment.)

Id           Article id          Rule Id         Title          Opened Date
--           ----------          -------         -----          -----------
1*           111                 5               T1             2013-01-20
2            112                 5               T1             2012-07-01
3*           113                 6               T2             2012-07-01
4*           114                 7               T2             2012-07-02
5            115                 8               T3             2012-07-01
6            116                 8               T3             2013-01-20
7*           117                 8               T3             2013-01-21

Table Schema:

CREATE TABLE [dbo].[test_ai](
    [id] [int] NOT NULL,
    [ArticleId] [varchar](50) NOT NULL,
    [ruleid] [varchar](50) NULL,
    [Title] [nvarchar](max) NULL,
    [AuditData_WhenCreated] [datetime] NULL,
PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)

Test Data Inserts:

insert into test_ai (id, articleid, ruleid, title, auditdata_whencreated) values (1, 111, 5, 'test 1', '2013-01-20')
insert into test_ai (id, articleid, ruleid, title, auditdata_whencreated) values (2, 112, 5, 'test 1', '2012-07-01')
insert into test_ai (id, articleid, ruleid, title, auditdata_whencreated) values (3, 113, 6, 'test 2', '2012-07-01')
insert into test_ai (id, articleid, ruleid, title, auditdata_whencreated) values (4, 114, 7, 'test 2', '2012-07-02')
insert into test_ai (id, articleid, ruleid, title, auditdata_whencreated) values (5, 115, 8, 'test 3', '2012-07-01')
insert into test_ai (id, articleid, ruleid, title, auditdata_whencreated) values (6, 116, 8, 'test 3', '2013-01-20')
insert into test_ai (id, articleid, ruleid, title, auditdata_whencreated) values (7, 117, 8, 'test 3', '2013-01-21')

My current query looks like this:

select * from test_ai
where test_ai.id in

-- set 1 - all rows with duplicates
(select f.id 
from test_ai as F 
WHERE exists (select ruleid, title, count(id)   
FROM test_ai
    WHERE test_ai.title = F.title
        AND test_ai.ruleid = F.ruleid
    GROUP BY test_ai.title, test_ai.ruleid
    having count(test_ai.id) > 1))
    and test_ai.id not in

-- set 2 - includes one row from each set of duplicates
(select min(id)
from test_ai as F
WHERE EXISTS (select ruleid, title, count(id)
from test_ai
WHERE test_ai.title = F.title 
    AND test_ai.ruleid = F.ruleid
group by test_ai.title, test_ai.ruleid
HAVING count(test_ai.id) > 1)   
GROUP BY title, ruleid
)   

This SQL identifies a set of rows to delete (rows 2, 6, 7), but it does not choose the oldest rows by 'opened date'; it should delete rows 2, 5, 6. I realize I am not telling the statement to do this, but I am struggling with how to add this last piece. If the result is a script that I need to run more than once when a group has more than one duplicate, that is not a problem.

The actual problem is significantly more complicated, but if I can get past this one blocking part, I’ll be able to move forward again. Thanks for taking a look!

1 Answer

  1. Editorial Team
    Added an answer on June 17, 2026 at 4:37 pm

    The typical model for deleting one row from a set (or from each group in a set) in SQL Server 2005+ is:

    ;WITH cte AS 
    (
      SELECT col, rn = ROW_NUMBER() OVER 
        (PARTITION BY something ORDER BY something)
      FROM dbo.base_table
      WHERE ...
    )
    DELETE cte WHERE rn = 1;
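
    One caveat with the rn = 1 form: every partition has a row numbered 1, so a group that contains only a single row would lose that row too. A minimal sketch of one way to guard against that, using a throwaway table variable; the names @t, grp, created and cnt are made up for illustration and are not part of the pattern above:

    ```sql
    -- Hypothetical demo data: group 'a' has two rows, group 'b' has one.
    DECLARE @t TABLE (id int PRIMARY KEY, grp varchar(10), created datetime);
    INSERT INTO @t (id, grp, created) VALUES (1, 'a', '20130101');
    INSERT INTO @t (id, grp, created) VALUES (2, 'a', '20130201');
    INSERT INTO @t (id, grp, created) VALUES (3, 'b', '20130101');

    ;WITH cte AS
    (
      SELECT id,
        -- rn = 1 is the oldest row in each group (ascending order)
        rn  = ROW_NUMBER() OVER (PARTITION BY grp ORDER BY created),
        -- cnt is the number of rows in the group
        cnt = COUNT(*)     OVER (PARTITION BY grp)
      FROM @t
    )
    DELETE cte
      OUTPUT deleted.id
      WHERE rn = 1 AND cnt > 1;  -- cnt > 1 leaves single-row groups alone
    ```

    Without the cnt > 1 condition, the lone row in group 'b' would be deleted along with the oldest row in group 'a'. Filtering on rn > 1 instead avoids the issue entirely, because a single-row group never has a row numbered above 1.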
    

    In your case this would be:

    ;WITH cte AS 
    (
      SELECT id, ruleid, Title, rn = ROW_NUMBER() OVER 
      (
         PARTITION BY ruleid, Title  
         ORDER BY auditdata_whencreated DESC
      )
      FROM dbo.test_ai
    )
    DELETE cte 
      OUTPUT deleted.id
      WHERE rn > 1;
    

    Results:

    id
    ----
    2
    6
    5
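
    Because the DELETE removes rows from the underlying table, it may be worth previewing the outcome and running the statement inside a transaction first. A minimal sketch, assuming the dbo.test_ai table from the question; the planned_action and remaining column names and the closing ROLLBACK are illustrative choices, not part of the answer above:

    ```sql
    BEGIN TRANSACTION;

    -- Preview: label each row by what the delete would do to it.
    ;WITH cte AS
    (
      SELECT id, ruleid, Title, auditdata_whencreated,
        rn = ROW_NUMBER() OVER
        (
          PARTITION BY ruleid, Title
          ORDER BY auditdata_whencreated DESC
        )
      FROM dbo.test_ai
    )
    SELECT id, ruleid, Title, auditdata_whencreated,
      planned_action = CASE WHEN rn = 1 THEN 'keep' ELSE 'delete' END
    FROM cte
    ORDER BY ruleid, Title, rn;

    -- Delete everything except the newest row per (ruleid, Title).
    ;WITH cte AS
    (
      SELECT id, rn = ROW_NUMBER() OVER
        (
          PARTITION BY ruleid, Title
          ORDER BY auditdata_whencreated DESC
        )
      FROM dbo.test_ai
    )
    DELETE cte
      OUTPUT deleted.id
      WHERE rn > 1;

    -- Check: this should return no rows if no duplicates remain.
    SELECT ruleid, Title, COUNT(*) AS remaining
    FROM dbo.test_ai
    GROUP BY ruleid, Title
    HAVING COUNT(*) > 1;

    -- Swap for COMMIT TRANSACTION once the output looks right.
    ROLLBACK TRANSACTION;
    ```

    If two duplicates can share the same auditdata_whencreated value, ROW_NUMBER() picks between them arbitrarily; adding a tiebreaker such as id to the ORDER BY makes the choice deterministic.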
    