I have mistakenly loaded duplicate files into a database table (IBM DB2 v9.7). I

Question

Asked: June 11, 2026

I have mistakenly loaded duplicate files into a database table (IBM DB2 v9.7). I need to delete the duplicate records without deleting valid data.

Initially, I thought of HAVING count(*) > 1 as the solution to my problem, but this will not work: our supplier produces parts with modified specs, so a file may be loaded more than once with valid data.

I know a few things:

- the date range for my duplicate records: between ‘2012-08-27’ and ‘2012-09-02’
- the attributes to use to validate data

This is my SQL code to identify the dupes:

SELECT CAST(ENDDATE AS DATE) ENDDATE,
       CAST(LOADEDON AS DATE),
       SUBSTR(SITEID,1,20) SITEID,
       SUBSTR(LOCATIONNAME_1,1,20),
       SUBSTR(RID,1,15),
       COUNT(RID)
FROM AUTOMATION
WHERE CAST(ENDDATE AS DATE) BETWEEN '2012-08-27' AND '2012-09-02'
GROUP BY CAST(ENDDATE AS DATE),
         CAST(LOADEDON AS DATE),
         SUBSTR(SITEID,1,20),
         SUBSTR(LOCATIONNAME_1,1,20),
         SUBSTR(RID,1,15)
ORDER BY 5 ASC
FOR FETCH ONLY WITH UR

EDIT: the set of columns that identifies a duplicate is RID, LOADEDON and FILENAME (FILENAME is not shown here).

This is a sample output

08/29/2012 09/05/2012 JGS Memphis          JGS Memphis          029369751671            518
09/01/2012 09/05/2012 Reynosa              Reynosa              029054883474            521
08/29/2012 09/05/2012 JGS Memphis          JGS Memphis          028881223425            522

I want to delete all the duplicate records in the timeframe ‘2012-08-27’ AND ‘2012-09-02’ without deleting the records that are loaded N times for legit reasons.

Note: the table does not have a primary key (like Rowid in MS Sqlserver, for instance)


1 Answer

Editorial Team · Answer 1 · 2026-06-11T11:04:38+00:00

I can’t quite tell which set of columns specifies a duplicate. The following assumes that it is the columns in your sample output:

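delete from (select t.*,
                    -- number the rows within each group of identical sample-output columns
                    row_number() over (partition by enddate, loadedon, siteid, locationname_1, rid
                                       order by loadedon desc) as seqnum
             from automation t
             -- only touch the date range that was loaded twice
             where cast(enddate as date) between '2012-08-27' and '2012-09-02'
            ) t
where seqnum > 1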

This uses row_number() to number the rows within each group of matching values and deletes every row numbered above 1, so exactly one row per group stays in the table.
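Before running a delete like this, it may be worth previewing what it would remove. The query below is only a sketch and has not been run against your table. It assumes that a duplicate is any extra row sharing RID, LOADEDON and FILENAME (the key named in the question's edit) and that FILENAME is a column of AUTOMATION. For each key it counts the rows that a delete on seqnum > 1 would remove.

```sql
-- Dry run: count the rows that a "seqnum > 1" delete would remove.
-- Assumption: duplicates are extra rows sharing RID, LOADEDON and FILENAME;
-- change the partition columns if a different key defines a duplicate.
select rid, loadedon, filename, count(*) as rows_to_delete
from (select t.rid, t.loadedon, t.filename,
             row_number() over (partition by rid, loadedon, filename
                                order by rid) as seqnum
      from automation t
      -- same date range as the suspected double load
      where cast(enddate as date) between '2012-08-27' and '2012-09-02'
     ) d
where seqnum > 1
group by rid, loadedon, filename
order by rows_to_delete desc
for fetch only with ur
```

If the counts look right, the same partition columns can be used in the delete. Files that were legitimately reloaded should survive as long as they differ in at least one of the partition columns.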
