I wrote this python script to import a specific xls file into mysql. It

Question

Asked: June 13, 2026 (2026-06-13T10:43:02+00:00)

I wrote this python script to import a specific xls file into mysql. It works fine but if it’s run twice on the same data it will create duplicate entries. I’m pretty sure I need to use MySQL JOIN but I’m not clear on how to do that. Also is executemany() going to have the same overhead as doing inserts in a loop? I’m obviously trying to avoid that.
Here’s the code in question…

import MySQLdb

mailing_list = {}
for row in range(sheet.nrows):
    # Name is in column 0; email is in column 4.
    name = sheet.cell(row, 0).value
    email = sheet.cell(row, 4).value
    if name and email:
        mailing_list[name.lstrip()] = email.strip()

# Sorted list of (name, email) tuples, the shape executemany() expects.
rows = sorted(mailing_list.items())

db = MySQLdb.connect(host=host, user=user, db=dbname, passwd=pwd)
cursor = db.cursor()
cursor.executemany(
    "INSERT IGNORE INTO mailing_list (name, email) VALUES (%s, %s)", rows)
db.commit()  # MySQLdb does not autocommit by default

CLARIFICATION…

I read here that…

To be sure, executemany() is effectively the same as simple iteration.
However, it is typically faster. It provides an optimized means of
affecting INSERT and REPLACE across multiple rows.

Also, I took Unode's suggestion and used the UNIQUE constraint. But the IGNORE keyword is better for me than ON DUPLICATE KEY UPDATE because I want duplicates to fail silently.

TL;DR

~~1. What’s the best way to prevent duplicate inserts?~~
ANSWER 1: A UNIQUE constraint on the column, with INSERT IGNORE to skip duplicates silently or ON DUPLICATE KEY UPDATE to update the existing row instead.

2. Is executemany() as expensive as INSERT in a loop?
@Unode says it’s not but my research tells me otherwise. I would like a definitive answer.
3. Is this the best way, or is it going to be really slow with bigger tables, and how would I test to be sure?

1 Answer

Editorial Team · Answer 1 · 2026-06-13T10:43:03+00:00

1 – What’s the best way to prevent duplicate inserts?

Depending on what “preventing” means in your case, you have two strategies and one requirement.

The requirement is that you add a UNIQUE constraint on the column or columns that you want to be unique. This alone will cause an error if you attempt to insert a duplicate entry. However, given that you are using executemany(), the outcome may not be what you would expect.

Then as strategies you can do:

- An initial filter step: run a SELECT statement before inserting. This means running one SELECT statement per item in your rows to check whether it already exists. This strategy works but is inefficient.
- Using ON DUPLICATE KEY UPDATE. This automatically triggers an update if the data already exists. For more information, refer to the official documentation.
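
To make the requirement and the duplicate-handling options concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in so it runs without a MySQL server. SQLite spells the statements differently, so the MySQL equivalents are given in comments. The choice of email as the unique column and the helper names insert_ignoring_duplicates, upsert and count_rows are assumptions for illustration, not part of the original script.

```python
import sqlite3

# sqlite3 stands in for MySQL here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
# Assumption: email is the column that must be unique.
# MySQL: ALTER TABLE mailing_list ADD UNIQUE (email)
conn.execute("CREATE TABLE mailing_list (name TEXT, email TEXT UNIQUE)")


def insert_ignoring_duplicates(conn, rows):
    """Skip rows whose email already exists (hypothetical helper)."""
    # MySQL: INSERT IGNORE INTO mailing_list (name, email) VALUES (%s, %s)
    conn.executemany(
        "INSERT OR IGNORE INTO mailing_list (name, email) VALUES (?, ?)", rows)
    conn.commit()


def upsert(conn, rows):
    """Overwrite the name when the email already exists (hypothetical helper)."""
    # MySQL: INSERT INTO mailing_list (name, email) VALUES (%s, %s)
    #        ON DUPLICATE KEY UPDATE name = VALUES(name)
    conn.executemany(
        "INSERT INTO mailing_list (name, email) VALUES (?, ?) "
        "ON CONFLICT(email) DO UPDATE SET name = excluded.name", rows)
    conn.commit()


def count_rows(conn):
    """Return the number of rows currently in mailing_list."""
    return conn.execute("SELECT COUNT(*) FROM mailing_list").fetchone()[0]


rows = [("Ann", "ann@example.com"), ("Bob", "bob@example.com")]
insert_ignoring_duplicates(conn, rows)
insert_ignoring_duplicates(conn, rows)  # second run adds nothing
print(count_rows(conn))  # 2

# A plain INSERT of a duplicate raises instead of failing silently.
try:
    conn.execute(
        "INSERT INTO mailing_list (name, email) VALUES (?, ?)", rows[0])
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)

# The upsert keeps one row per email but replaces the stored name.
upsert(conn, [("Ann B.", "ann@example.com")])
```

With MySQLdb the same pattern should apply using the statements in the comments and %s placeholders. Which variant fits depends on whether a re-import should leave existing rows untouched or refresh them.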

2 – Is executemany() as expensive as INSERT in a loop?

No. For an INSERT … VALUES statement like yours, executemany() builds one query that inserts the rows in bulk, whereas a for loop sends as many queries as there are elements in your rows.
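
As for how to test this yourself, one option is to time both approaches on the same data. The following is a rough sketch, not a rigorous benchmark. It runs against an in-memory sqlite3 database so that it is self-contained. The helper names (timed, insert_in_loop, insert_with_executemany) and the generated test rows are hypothetical. SQLite numbers say nothing about MySQL, where network round trips matter, so for a real answer you would point the same functions at your MySQLdb connection and use %s placeholders.

```python
import sqlite3
import time

# sqlite3 is used only so the sketch runs anywhere; swap in a MySQLdb
# connection (and %s placeholders) to measure the real thing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mailing_list (name TEXT, email TEXT)")

SQL = "INSERT INTO mailing_list (name, email) VALUES (?, ?)"
# Generated test data (assumption: 10,000 rows is enough to see a trend).
rows = [("name%d" % i, "user%d@example.com" % i) for i in range(10000)]


def insert_in_loop(cur):
    """One execute() call per row."""
    for row in rows:
        cur.execute(SQL, row)


def insert_with_executemany(cur):
    """A single executemany() call for all rows."""
    cur.executemany(SQL, rows)


def timed(conn, insert_fn):
    """Empty the table, run insert_fn, commit, and return elapsed seconds."""
    conn.execute("DELETE FROM mailing_list")
    conn.commit()
    start = time.perf_counter()
    insert_fn(conn.cursor())
    conn.commit()
    return time.perf_counter() - start


loop_seconds = timed(conn, insert_in_loop)
many_seconds = timed(conn, insert_with_executemany)
print("loop:        %.4f s" % loop_seconds)
print("executemany: %.4f s" % many_seconds)
```

Running it several times with increasing row counts gives a feel for how each approach scales with table size.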
