The Archive Base

Editorial Team
Asked: June 13, 2026

I wrote this python script to import a specific xls file into mysql. It

I wrote this Python script to import a specific xls file into MySQL. It works fine, but if it’s run twice on the same data it creates duplicate entries. I’m pretty sure I need to use a MySQL JOIN, but I’m not clear on how to do that. Also, is executemany() going to have the same overhead as doing inserts in a loop? I’m obviously trying to avoid that.
Here’s the code in question…

import MySQLdb

# sheet, host, user, dbname and pwd are defined earlier in the script.
mailing_list = {}
rows = []

for row in range(sheet.nrows):
    # name is in column 0, email is in column 4
    name = sheet.cell(row, 0).value
    email = sheet.cell(row, 4).value
    if name and email:
        mailing_list[name.lstrip()] = email.strip()

# items() works on both Python 2 and Python 3 (iteritems() is Python 2 only)
for n, e in sorted(mailing_list.items()):
    rows.append((n, e))

db = MySQLdb.connect(host=host, user=user, db=dbname, passwd=pwd)
cursor = db.cursor()
cursor.executemany(
    "INSERT IGNORE INTO mailing_list (name, email) VALUES (%s, %s)", rows)
db.commit()  # MySQLdb does not autocommit by default

CLARIFICATION…

I read here that…

To be sure, executemany() is effectively the same as simple iteration.
However, it is typically faster. It provides an optimized means of
affecting INSERT and REPLACE across multiple rows.

Also, I took Unode’s suggestion and used the UNIQUE constraint. For my case the IGNORE keyword is a better fit than ON DUPLICATE KEY UPDATE because I want duplicates to fail silently.

TL;DR

1. What’s the best way to prevent duplicate inserts?
ANSWER 1: A UNIQUE constraint on the column, with INSERT IGNORE to fail silently or ON DUPLICATE KEY UPDATE to update the existing row instead.

2. Is executemany() as expensive as INSERT in a loop?
    @Unode says it’s not, but my research tells me otherwise. I would like a definitive answer.
3. Is this the best way, or is it going to be really slow with bigger tables, and how would I test to be sure?

1 Answer

  1. Editorial Team
    Added an answer on June 13, 2026 at 10:43 am

    1 – What’s the best way to prevent duplicate inserts?

    Depending on what “preventing” means in your case, you have two strategies and one requirement.

    The requirement is that you add a UNIQUE constraint on the column or columns that you want to be unique. This alone will cause an error whenever a duplicate entry is inserted. However, because you are using executemany, the outcome may not be what you expect: the batch is sent as a multi-row INSERT, so a single duplicate aborts the whole statement rather than just skipping that row.
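As a minimal sketch of what the constraint does, the snippet below uses Python's built-in sqlite3 module as a stand-in so it runs without a MySQL server. The table layout, the sample addresses, and the choice of email as the unique column are assumptions for illustration. In MySQL the constraint could be added with something like ALTER TABLE mailing_list ADD UNIQUE (email), and INSERT IGNORE plays the role of SQLite's INSERT OR IGNORE.

```python
import sqlite3

# Stand-in database: in-memory SQLite instead of MySQL (assumption, for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mailing_list (name TEXT, email TEXT, UNIQUE (email))"
)

conn.execute(
    "INSERT INTO mailing_list (name, email) VALUES (?, ?)",
    ("Ann", "ann@example.com"),
)

# A second insert with the same email violates the UNIQUE constraint.
try:
    conn.execute(
        "INSERT INTO mailing_list (name, email) VALUES (?, ?)",
        ("Ann B.", "ann@example.com"),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# INSERT OR IGNORE skips the offending row silently,
# much like MySQL's INSERT IGNORE.
conn.executemany(
    "INSERT OR IGNORE INTO mailing_list (name, email) VALUES (?, ?)",
    [("Ann", "ann@example.com"), ("Bob", "bob@example.com")],
)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM mailing_list").fetchone()[0]
print(duplicate_rejected, row_count)
```

Note that SQLite's executemany runs the statement once per row, so this only illustrates the constraint itself, not how MySQL treats a multi-row batch.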

    Then as strategies you can do:

    • Filtering first with SELECT. This means running one SELECT statement per item in your rows to check whether it already exists, then inserting only the missing ones. This strategy works but is inefficient.

    • Using ON DUPLICATE KEY UPDATE. This automatically triggers an update if the data already exists. For more information refer to the official documentation.
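The second strategy could look roughly like the sketch below. It assumes a mailing_list(name, email) table with a UNIQUE index on email, and the helper name upsert_rows is hypothetical. It accepts any DB-API cursor connected to MySQL, such as one from MySQLdb.

```python
# Assumed schema: mailing_list(name, email) with a UNIQUE index on email.
UPSERT_SQL = (
    "INSERT INTO mailing_list (name, email) VALUES (%s, %s) "
    "ON DUPLICATE KEY UPDATE name = VALUES(name)"
)


def upsert_rows(cursor, rows):
    """Insert (name, email) pairs; refresh the name when the email exists.

    `cursor` is any DB-API cursor connected to MySQL (hypothetical helper).
    Returns the number of rows that were sent.
    """
    rows = list(rows)
    if rows:  # nothing to send for an empty batch
        cursor.executemany(UPSERT_SQL, rows)
    return len(rows)
```

With the connection from the question, this would be called as upsert_rows(db.cursor(), rows) followed by db.commit(). If duplicates should simply be skipped instead of updated, swapping the statement for INSERT IGNORE gives the silent behaviour described in the question.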

    2 – Is executemany() as expensive as INSERT in a loop?

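    No. For an INSERT ... VALUES statement, MySQLdb's executemany builds a single multi-row query and inserts in bulk, whereas a for loop sends as many queries as there are elements in your rows. For other kinds of statements, executemany falls back to running the statement once per row, which is why it is sometimes described as equivalent to simple iteration.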
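To make "one query which inserts in bulk" concrete, here is a small sketch that builds a single multi-row INSERT from a list of rows. It illustrates the idea only and is not MySQLdb's actual implementation. The function name build_bulk_insert and the sample data are hypothetical.

```python
def build_bulk_insert(table, columns, rows):
    """Return (sql, params) for one multi-row INSERT covering all rows.

    `table` and `columns` are interpolated directly, so they must be
    trusted identifiers; only the values go through placeholders.
    """
    rows = [tuple(r) for r in rows]
    if not rows:
        raise ValueError("rows must not be empty")
    one_row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([one_row] * len(rows))
    )
    # Flatten the row tuples into one parameter list, in placeholder order.
    params = [value for row in rows for value in row]
    return sql, params


sql, params = build_bulk_insert(
    "mailing_list",
    ["name", "email"],
    [("Ann", "ann@example.com"), ("Bob", "bob@example.com")],
)
print(sql)
print(params)
```

A loop would instead issue one single-row INSERT per element, paying the network round trip and statement parsing cost each time. Timing both variants against a copy of the real table is a reasonable way to check how they scale.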
