- lets assume you have a table with 1M rows and growing …
- every five minutes of every day you run a python programm which have to update some fields of 50K rows
my question is: what is the fastest way to do the work?
- runs those updates in loop and after last one is executed than fire up a cursor commit?
- or generate file and than run it throught command line?
- create temp table by huge and fast insert and than run a single update to production table?
- do prepared statements?
- split it up to 1K updates per execute, to generate smaller logs files?
- turn off logging while running update?
- or do a cases in mysql examples (but this works only up to 255 rows)
i dont know … have anyone do something like this? what is the best practise? i need to run it as fast as possible …
Here’s some ways you could speed up your
UPDATES.When you
UPDATE, the table records are just being rewritten with new data. And all this must be done again onINSERT. That’s why you should always useINSERT ... ON DUPLICATE KEY UPDATEinstead ofREPLACE.The former one is an
UPDATEoperation in case of a key violation, while the latter one isDELETE / INSERTHere’s an example
INSERT INTO table (a,b,c) VALUES (1,2,3) ON DUPLICATE KEY UPDATE c=c+1;More on this here.UPDATE1: It’s a good idea to do your inserts all in a single query. This should speed up your
UPDATES. See here on how to do that.UPDATE2: Now that I have had a chance to read your other sub-questions. Here’s what I know-
UPDATEin a single sql & single commit.SQLqueries are more important.TABLE& theINDEXESyou have, plusINNODB or MYISAM.slow_query_logturned on. This logs all slow queries to a separate logfile. Turn this off too.