I have a huge update in which I need to insert around 40 GB of data into an InnoDB table. It's taking quite a while, so I'm wondering which method would be fastest (and, more importantly, why, since otherwise I could just run a split test).
Method 1)
a) Insert all rows
b) Run ALTER TABLE su_tmp_matches ADD PRIMARY KEY ( id )
Method 2)
a) ALTER TABLE su_tmp_matches ADD PRIMARY KEY ( id )
b) Insert all rows
Currently we are using method 1, but step b) seems to take a very long time. So I'm wondering whether the size of the data (40 GB, 5 million rows) plays a role here.
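For anyone who wants to run the split test themselves, here is a minimal sketch of a timing harness for the two orderings. The table name and the `id` column come from the question; the source table `su_matches_source` and the `INSERT ... SELECT` form of the load are assumptions made purely for illustration, and `execute` can be any callable that runs one SQL statement.

```python
import time

TABLE = "su_tmp_matches"      # table name from the question
SOURCE = "su_matches_source"  # hypothetical source table (an assumption)


def method_1(table=TABLE, source=SOURCE):
    """Method 1: insert all rows first, then add the primary key."""
    return [
        f"INSERT INTO {table} SELECT * FROM {source}",
        f"ALTER TABLE {table} ADD PRIMARY KEY (id)",
    ]


def method_2(table=TABLE, source=SOURCE):
    """Method 2: add the primary key first, then insert all rows."""
    return [
        f"ALTER TABLE {table} ADD PRIMARY KEY (id)",
        f"INSERT INTO {table} SELECT * FROM {source}",
    ]


def time_statements(execute, statements):
    """Run each statement through `execute`; return (statement, seconds) pairs."""
    timings = []
    for sql in statements:
        start = time.perf_counter()
        execute(sql)
        timings.append((sql, time.perf_counter() - start))
    return timings
```

With a DB-API cursor this would be called as `time_statements(cursor.execute, method_1())`. The target table has to be recreated empty (without a primary key) before each run, otherwise the two measurements are not comparable.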
Update: I decided to test this as well.
Test setup: a fast, brand-new MySQL server with plenty of fast RAM, fast disks, and a well-tuned configuration (we handle more than 5000 requests per second on a single machine):
1.6 million rows / 6 GB of data:
81 seconds to “delete” a primary index
550 seconds to “add” a primary index (after data is added)
120 seconds to create a copy of the table with the primary index created BEFORE the data insert
80 seconds to create a copy of the table without the primary index (adding it afterwards then takes 550 seconds)
This seems pretty absurd. The question is whether the resulting indexes are actually the same thing.
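Adding up the timings above makes the gap explicit. This is only arithmetic on the numbers from this one test run (1.6 million rows, 6 GB), not a general prediction:

```python
# Timings in seconds, taken from the test results above.
insert_without_pk = 80   # copy the table with no primary index
add_pk_afterwards = 550  # add the primary index once the data is in
insert_with_pk = 120     # copy the table with the primary index already defined

method_1_total = insert_without_pk + add_pk_afterwards  # insert, then index
method_2_total = insert_with_pk                         # index, then insert
speedup = method_1_total / method_2_total

print(f"Method 1: {method_1_total} s")  # Method 1: 630 s
print(f"Method 2: {method_2_total} s")  # Method 2: 120 s
print(f"Method 2 is {speedup:.2f}x faster in this test")  # 5.25x
```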
From the documentation :
It seems to me that adding the uniqueness constraint before the insert could only help the engine if the primary key column is an auto-incremented integer. But I really doubt there would be a notable difference.
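The intuition behind the auto-increment case is that InnoDB stores rows clustered by primary key, so keys arriving in ascending order are always appended at the end. The following is a toy analogy using a sorted Python list, not a model of InnoDB's B-tree internals; `count_shifts` is a hypothetical helper that counts how many existing entries sit after each insertion point and would have to move.

```python
import bisect


def count_shifts(keys):
    """Insert keys one at a time into a sorted list and count how many
    existing entries lie after each insertion point."""
    ordered = []
    shifts = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        shifts += len(ordered) - pos  # entries that must move to make room
        ordered.insert(pos, key)
    return shifts


ascending = count_shifts(range(1, 6))       # keys 1..5, like an auto-increment id
descending = count_shifts(range(5, 0, -1))  # keys 5..1, worst case for this toy

print(ascending)   # 0
print(descending)  # 10
```

Ascending keys never displace anything in this toy, while out-of-order keys do. Whether that translates into a measurable difference on a real server is exactly what a benchmark has to show.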
A useful recommendation :
EDIT: in my experience, MySQL doesn't always perform the way the documentation suggests, so I think any benchmark you run on this would be interesting, even if it is not a definitive answer in itself.