I have a large quantity of data in a production database that I want

Asked: May 21, 2026 (2026-05-21T07:29:25+00:00)

I have a large quantity of data in a production database that I want to update with batches of data while the table stays available to end users. The updates could be insertions of new rows or updates of existing rows. The table holds approximately 50M rows, and each batch will contain between 100k and 1M rows. What I would like to do is an insert/replace with low priority. In other words, I want the database to work through the batch import slowly, without hurting the performance of other queries running concurrently against the same disk spindles. To complicate this, the table being updated is heavily indexed: 8 b-tree indexes across multiple columns to facilitate various lookups, which adds quite a bit of overhead to the import.

I’ve thought about breaking the inserts into blocks of 1-2k records and having the external script that loads the data pause for a couple of seconds between blocks, but that’s really kind of hokey IMHO. Plus, during a 1M-record batch, I really don’t want to add 500-1000 two-second pauses, which comes to roughly 17-33 minutes of extra load time, if it’s not needed. Does anyone have ideas on a better way to do this?

1 Answer

Editorial Team · Answer 1 · 2026-05-21T07:29:26+00:00

I’ve dealt with a similar scenario using InnoDB and hundreds of millions of rows. Batching with a throttling mechanism is the way to go if you want to minimize risk to end users. I’d experiment with different pause times and see what works for you. Small batches have the benefit that you can adjust the pause as you go. You might find that you don’t need any pause at all if you run the batches sequentially over a single connection. If your end users are using more connections, they’ll naturally get more resources.
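A minimal sketch of the batch-and-pause idea in Python. It assumes the rows arrive as an iterable and that `write_batch` is a hypothetical callable you supply to run the actual INSERT/UPDATE for one block; none of these names come from the original setup. The pause is a parameter so you can tune it, including down to zero:

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` rows."""
    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter block
        yield batch


def load_throttled(
    rows: Iterable[T],
    write_batch: Callable[[List[T]], None],  # hypothetical: runs one block's SQL
    batch_size: int = 1000,
    pause_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write rows in small blocks, pausing after each one. Returns rows written."""
    total = 0
    for batch in chunked(rows, batch_size):
        write_batch(batch)
        total += len(batch)
        if pause_seconds > 0:
            sleep(pause_seconds)
    return total
```

Starting with `pause_seconds=0` and raising it only if end users notice the load is one way to avoid paying for pauses you don’t need.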

If you’re using MyISAM, there’s a LOW_PRIORITY option for UPDATE. If you’re using InnoDB with replication, be sure to check that replication isn’t falling too far behind because of the extra load. Apparently replication applies changes in a single thread, and that turned out to be the bottleneck for us. Consequently, we programmed our throttling mechanism to check how far behind replication was and pause as needed.
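A rough sketch of a lag-based throttle like that, in Python. `get_lag_seconds` is a hypothetical callable; the assumption here is that it returns the `Seconds_Behind_Master` value from `SHOW SLAVE STATUS` on the replica, which is NULL (here `None`) when replication is not running. The thresholds are illustrative placeholders, not tuned values:

```python
import time
from typing import Callable, Optional


def wait_for_replica(
    get_lag_seconds: Callable[[], Optional[float]],  # hypothetical lag probe
    max_lag_seconds: float = 5.0,
    poll_seconds: float = 1.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Block until the reported lag is at or below max_lag_seconds.

    Returns the total time spent sleeping. A lag of None (replication
    not running) is treated as "too far behind". Raises TimeoutError
    if the replica has not caught up within timeout_seconds.
    """
    waited = 0.0
    while True:
        lag = get_lag_seconds()
        if lag is not None and lag <= max_lag_seconds:
            return waited
        if waited >= timeout_seconds:
            raise TimeoutError("replica did not catch up in time")
        sleep(poll_seconds)
        waited += poll_seconds
```

Calling this between blocks, instead of using a fixed pause, means the import only slows down when the replica actually falls behind.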
