I have a PostgreSQL database that I am updating with around 100000 records. I use session.merge() to insert/update each record, and I commit after every 1000 records.
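    for i, record in enumerate(records, start=1):
        session.merge(record)
        # Commit once per 1000 merged records
        if i % 1000 == 0:
            session.commit()
    # Commit whatever is left over from the last partial batch
    session.commit()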
This code works fine. In my database I have a table with a UNIQUE field, and some of the records I insert into it are duplicates. When this happens, an error is thrown saying the field is not unique. Since I am inserting 1000 records at a time, a rollback will not help me skip these records. Is there any way I can skip the session.merge() for the duplicate records (other than parsing through all the records to find the duplicates, of course)?
This is the option that works best for me, because the number of records with duplicate unique keys is minimal.
Say I hit an error at record 2375: I store the primary key 'pk' of record 2375 in failed_records, roll back, and then recommit the records from 2000 to 2375, leaving out the failed one. This seems much faster than committing the records one by one.
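A rough sketch of what I mean, not my exact code. The function name merge_skipping_duplicates and the parameters get_pk (returns a record's primary key) and duplicate_error (the exception class to catch) are made up for illustration. I also assume a session.flush() after each merge, so that the constraint error shows up at the offending record instead of at some later commit.

```python
def merge_skipping_duplicates(session, records, get_pk, duplicate_error,
                              batch_size=1000):
    """Merge records in batches, skipping any that violate a constraint.

    Hypothetical helper: `get_pk` and `duplicate_error` are supplied by the
    caller. Returns the primary keys of the records that were skipped.
    """
    failed_records = []  # primary keys of the records that raised the error
    pending = []         # records merged since the last successful commit

    for record in records:
        try:
            session.merge(record)
            # Flush now so a constraint violation is raised for this record
            session.flush()
        except duplicate_error:
            # The rollback discards everything since the last commit
            session.rollback()
            failed_records.append(get_pk(record))
            # Replay the good records of the interrupted batch and commit them
            for ok_record in pending:
                session.merge(ok_record)
            session.commit()
            pending = []
            continue

        pending.append(record)
        if len(pending) >= batch_size:
            session.commit()
            pending = []

    # Commit the final partial batch, if any
    if pending:
        session.commit()
    return failed_records
```

With SQLAlchemy I would expect to pass sqlalchemy.exc.IntegrityError as duplicate_error, and something like lambda r: r.id as get_pk if the primary key column is called id. The replay only costs one extra pass over the interrupted batch, which is why it stays cheap as long as duplicates are rare.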