The Archive Base

The Archive Base Latest Questions

Editorial Team
Asked: June 7, 2026

I want to delete rows that have a duplicated value in one column (Product), which will then be used as the primary key.

The column is of type nvarchar, and we don’t want to have two rows for one product.
The database is large, and there are thousands of duplicate rows that we need to remove.

For each set of duplicates, we want to keep the first row and remove the second one as the duplicate.

There is no primary key yet; we want to create one after removing the duplicates.
The Product column can then become our primary key.

The database is SQL Server CE.

I tried several methods, and mostly get an error similar to:

There was an error parsing the query. [ Token line number = 2,Token line offset = 1,Token in error = FROM ]

One method I tried:

DELETE FROM TblProducts
FROM TblProducts w
    INNER JOIN (
            SELECT Product
            FROM TblProducts
            GROUP BY Product
            HAVING COUNT(*) > 1
            )Dup ON w.Product = Dup.Product

The approach I would prefer, and which I am trying to learn from and adapt to my code, is something like this
(it’s not correct yet):

SELECT Product, COUNT(*) TotalCount
FROM TblProducts
GROUP BY Product
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC

--
;WITH cte   -- These 3 lines are the lines I have more doubt on them
     AS (SELECT ROW_NUMBER() OVER (PARTITION BY Product
                                       ORDER BY ( SELECT 0)) RN
         FROM   Word)
DELETE FROM cte
WHERE  RN > 1
1 Answer

  1. Editorial Team
    Added an answer on June 7, 2026 at 1:56 pm

    If you have two DIFFERENT records with the same Product value, then you can SELECT the unwanted records by some criterion, e.g.

     CREATE TABLE victims AS
         SELECT MAX(entryDate) AS date, Product, COUNT(*) AS dups FROM ProductsTable WHERE ...
         GROUP BY Product HAVING dups > 1;
    

    Then you can do a DELETE JOIN between ProductsTable and victims.

    Alternatively, you can select only Product, and then do the DELETE with some other JOIN condition, for example an invalid CustomerId, or a NULL EntryDate, or anything else. This works if you know that there is one and only one valid copy of each Product, and that all the others are recognizable by their invalid data.
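A minimal sketch of the victims-then-delete idea, run against SQLite through Python's sqlite3 module purely so that it can be tried out. Several things here are assumptions and not part of the original: the sample rows, the choice to treat the latest entryDate as the unwanted copy, and the EXISTS subquery, which stands in for DELETE JOIN because SQLite does not have that syntax. The elided WHERE filter is simply left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductsTable (Product TEXT, entryDate TEXT);
    INSERT INTO ProductsTable VALUES
        ('A', '2020-01-01'), ('A', '2020-02-01'),   -- duplicated product
        ('B', '2020-01-15');                        -- unique product

    -- Victims: the latest entry of every product that occurs more than once.
    CREATE TABLE victims AS
        SELECT MAX(entryDate) AS date, Product, COUNT(*) AS dups
        FROM ProductsTable
        GROUP BY Product
        HAVING COUNT(*) > 1;

    -- SQLite has no DELETE ... JOIN, so the join is written as EXISTS.
    DELETE FROM ProductsTable
    WHERE EXISTS (SELECT 1 FROM victims v
                  WHERE v.Product = ProductsTable.Product
                    AND v.date = ProductsTable.entryDate);
""")

remaining = conn.execute(
    "SELECT Product, entryDate FROM ProductsTable ORDER BY Product"
).fetchall()
print(remaining)
```

Each pass removes only one row per duplicated product, so a product with three or more copies would need the two statements repeated until victims comes back empty.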

    Suppose instead that you have IDENTICAL records (or a mix of identical and non-identical ones, or several dupes for some product without knowing which is which). You run exactly the same query to find the duplicated product codes. Then you run a SELECT on ProductsTable for all products matching those codes, grouping by Product and choosing a suitable aggregate function for every other field (if the rows are identical, any aggregate should do; otherwise I usually try MAX or MIN). This will “save” exactly one row for each product.

    At that point you run the DELETE JOIN and kill all the duplicated products. Then, simply reimport the saved and deduped subset into the main table.

    Of course, between the DELETE JOIN and the INSERT SELECT, the DB will be in an unstable state: every product that had at least one duplicate will have simply disappeared.
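The save, delete and reimport steps can be sketched as follows, again using SQLite via Python's sqlite3 only as a stand-in engine. The table contents, the temporary table name saved, and the IN subquery used in place of DELETE JOIN are assumptions made for illustration. Wrapping the DELETE and the INSERT in one transaction is one way to keep other readers from seeing the intermediate state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductsTable (Product TEXT, Description TEXT);
    INSERT INTO ProductsTable VALUES
        ('A', 'first'), ('A', 'first'), ('A', NULL),   -- three copies of A
        ('B', 'only');                                 -- B is not duplicated
""")

# Save exactly one row per duplicated product; MAX ignores NULLs,
# so a non-NULL Description wins when one exists.
conn.execute("""
    CREATE TEMP TABLE saved AS
        SELECT Product, MAX(Description) AS Description
        FROM ProductsTable
        GROUP BY Product
        HAVING COUNT(*) > 1
""")

# "with conn" commits the DELETE and the INSERT together,
# or rolls both back if either statement raises.
with conn:
    conn.execute(
        "DELETE FROM ProductsTable "
        "WHERE Product IN (SELECT Product FROM saved)")
    conn.execute(
        "INSERT INTO ProductsTable SELECT Product, Description FROM saved")

rows = conn.execute(
    "SELECT Product, Description FROM ProductsTable ORDER BY Product"
).fetchall()
print(rows)
```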

    Another way which should work in MySQL:

    -- Create an empty table
    CREATE TABLE deduped AS SELECT * FROM ProductsTable WHERE false;
    
    CREATE UNIQUE INDEX deduped_ndx ON deduped(Product);
    
    -- DROP duplicate rows, Joe the Butcher's way
    INSERT IGNORE INTO deduped SELECT * FROM ProductsTable;
    
    ALTER TABLE ProductsTable RENAME TO ProductsBackup;
    
    ALTER TABLE deduped RENAME TO ProductsTable;
    -- TODO: Recreate on the new ProductsTable any indexes that the original table (now ProductsBackup) had.
    

    NOTE: the way above DOES NOT WORK if you want to distinguish “good records” and “invalid duplicates”. It only works if you have redundant DUPLICATE records, or if you do not care which row you keep and which you throw away!

    EDIT:
    You say that “duplicates” have invalid fields. In that case you can modify the above with a sorting trick:

      SELECT * FROM ProductsTable ORDER BY Product, FieldWhichShouldNotBeNULL IS NULL;
    

    Use this SELECT as the source of the INSERT IGNORE. Then if you have only one row for a product, all well and good, it will get selected. If you have more, the one for which (FieldWhichShouldNotBeNULL IS NULL) is FALSE (i.e. the one where FieldWhichShouldNotBeNULL is actually not null, as it should be) will be selected first, and inserted. All the others will bounce silently, due to the IGNORE clause, against the uniqueness of Product. Not a really pretty way to do it (and check I didn’t mix up true and false in my clause!), but it ought to work.
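A small sketch of the sorting trick, using SQLite via Python's sqlite3 as a stand-in for MySQL. SQLite spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE, and the Description column plays the role of FieldWhichShouldNotBeNULL; both substitutions, and the sample rows, are assumptions. Like the original, it relies on the engine feeding rows to the INSERT in ORDER BY order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductsTable (Product TEXT, Description TEXT);
    INSERT INTO ProductsTable VALUES
        ('A', NULL), ('A', 'good'),   -- the invalid copy is stored first
        ('B', NULL);                  -- single row, kept even though NULL

    -- Empty copy of the table with a unique index on Product.
    CREATE TABLE deduped AS SELECT * FROM ProductsTable WHERE 0;
    CREATE UNIQUE INDEX deduped_ndx ON deduped(Product);

    -- Description IS NULL evaluates to 0 for good rows and 1 for invalid
    -- ones, so within each product the good row reaches the index first.
    INSERT OR IGNORE INTO deduped
        SELECT * FROM ProductsTable
        ORDER BY Product, Description IS NULL;
""")

kept = conn.execute(
    "SELECT Product, Description FROM deduped ORDER BY Product"
).fetchall()
print(kept)
```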

    EDIT
    actually more of a new answer

    This is a simple table to illustrate the problem

    CREATE TABLE ProductTable ( Product varchar(10), Description varchar(10) );
    INSERT INTO ProductTable VALUES ( 'CBPD10', 'C-Beam Prj' );
    INSERT INTO ProductTable VALUES ( 'CBPD11', 'C Proj Mk2' );
    INSERT INTO ProductTable VALUES ( 'CBPD12', 'C Proj Mk3' );
    

    There is no index yet, and no primary key. We could still declare Product to be primary key.

    But something bad happens. Two new records get in, and both have NULL description.

    Yet, the second one is a valid product since we knew nothing of CBPD14 before now, and therefore we do NOT want to lose this record completely. We do want to get rid of the spurious CBPD10 though.

    INSERT INTO ProductTable VALUES ( 'CBPD10', NULL );
    INSERT INTO ProductTable VALUES ( 'CBPD14', NULL );
    

    A rude DELETE FROM ProductTable WHERE Description IS NULL is out of the question: it would kill CBPD14, which isn’t a duplicate.

    So we do it like this. First get the list of duplicates:

    SELECT Product, COUNT(*) AS Dups FROM ProductTable GROUP BY Product HAVING Dups > 1;
    

    We assume that: “There is at least one good record for every set of bad records”.

    We check this assumption by positing the opposite and querying for it. If all is copacetic we expect this query to return nothing.

    SELECT Dups.Product FROM ProductTable
    RIGHT JOIN ( SELECT Product, COUNT(*) AS Dups FROM ProductTable GROUP BY Product HAVING Dups > 1 ) AS Dups
    ON (ProductTable.Product = Dups.Product
            AND ProductTable.Description IS NOT NULL)
    WHERE ProductTable.Description IS NULL;
    

    To further verify, I insert two records that represent this mode of failure; now I do expect the query above to return the new code.

    INSERT INTO ProductTable VALUES ( 'AC5', NULL ), ( 'AC5', NULL );
    

    Now the “check” query indeed returns,

    AC5
    

    So, the generation of Dups looks good.
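The check can be replayed on the sample data with a short script. This is a sketch under assumptions: it runs on SQLite through Python's sqlite3, and the RIGHT JOIN is rewritten as the equivalent LEFT JOIN starting from Dups, because older SQLite versions do not support RIGHT JOIN. The constant name CHECK is only illustrative.

```python
import sqlite3

# Duplicated products that have no row with a non-NULL Description.
CHECK = """
    SELECT Dups.Product
    FROM (SELECT Product FROM ProductTable
          GROUP BY Product HAVING COUNT(*) > 1) AS Dups
    LEFT JOIN ProductTable
           ON ProductTable.Product = Dups.Product
          AND ProductTable.Description IS NOT NULL
    WHERE ProductTable.Description IS NULL
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductTable (Product TEXT, Description TEXT);
    INSERT INTO ProductTable VALUES
        ('CBPD10', 'C-Beam Prj'), ('CBPD11', 'C Proj Mk2'),
        ('CBPD12', 'C Proj Mk3'),
        ('CBPD10', NULL), ('CBPD14', NULL);
""")

# CBPD10 is duplicated but has a good record, so nothing is reported.
before = conn.execute(CHECK).fetchall()

# Add the failure mode: a duplicated product with no good record.
conn.execute("INSERT INTO ProductTable VALUES ('AC5', NULL), ('AC5', NULL)")
conn.commit()
after = conn.execute(CHECK).fetchall()

print(before, after)
```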

    I proceed now to delete all duplicate records that are not valid. If there are duplicate records that are all valid, they will stay duplicated unless some condition can be found that singles out one “good” record among them and declares all the others “invalid” (maybe by repeating the procedure with a different field than Description).

    But ay, there’s a rub. Currently, you cannot delete from a table and select from the same table in a subquery ( http://dev.mysql.com/doc/refman/5.0/en/delete.html ). So a little workaround is needed:

    CREATE TEMPORARY TABLE Dups AS
         SELECT Product, COUNT(*) AS Duplicates
             FROM ProductTable GROUP BY Product HAVING Duplicates > 1;
    
    DELETE ProductTable FROM ProductTable JOIN Dups USING (Product)
        WHERE Description IS NULL;
    

    Now this will delete all invalid records, provided that they appear in the Dups table.

    Therefore our CBPD14 record will be left untouched, because it does not appear there. The “good” record for CBPD10 will be left untouched because it’s not true that its Description is NULL. All the others – poof.

    Let me state again that if a product is duplicated and yet has no valid record, then all copies of that record will be killed: there will be no survivors.

    To avoid this, you may first SELECT the rows representing this mode of failure (using the check query above, the one “which should return nothing”) into another TEMPORARY TABLE, then INSERT them back into the main table after the deletion (using transactions might be in order).
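A sketch of the whole sequence with the rescue step included, run on SQLite through Python's sqlite3. The temporary table name Rescued is hypothetical, the failure-mode rows are found with NOT EXISTS instead of the join-based check, and the delete uses an IN subquery because SQLite lacks DELETE JOIN; all three are assumptions made to keep the example short and runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ProductTable (Product TEXT, Description TEXT);
    INSERT INTO ProductTable VALUES
        ('CBPD10', 'C-Beam Prj'), ('CBPD10', NULL),   -- good row plus bad copy
        ('CBPD14', NULL),                             -- NULL but not duplicated
        ('AC5', NULL), ('AC5', NULL);                 -- duplicated, no good row

    CREATE TEMP TABLE Dups AS
        SELECT Product FROM ProductTable
        GROUP BY Product HAVING COUNT(*) > 1;

    -- Duplicated products that have no valid copy at all.
    CREATE TEMP TABLE Rescued AS
        SELECT Product FROM Dups
        WHERE NOT EXISTS (SELECT 1 FROM ProductTable p
                          WHERE p.Product = Dups.Product
                            AND p.Description IS NOT NULL);
""")

# Delete the invalid duplicates and put back one copy of each rescued
# product; both statements commit together or not at all.
with conn:
    conn.execute(
        "DELETE FROM ProductTable "
        "WHERE Description IS NULL "
        "AND Product IN (SELECT Product FROM Dups)")
    conn.execute(
        "INSERT INTO ProductTable SELECT Product, NULL FROM Rescued")

rows = conn.execute(
    "SELECT Product, Description FROM ProductTable ORDER BY Product"
).fetchall()
print(rows)
```

After this, every Product value appears exactly once in the sample table, which is the precondition for declaring it the primary key.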
