Background
I have an array in the following format:
Array
(
    [0] => Array
        (
            [co_id] => 1
            [co_fname] => First
            [co_lname] => Last
            [co_company] => Company
            [co_address] => Address
            [co_ddi] => ddinumber
            [co_mobile] => mobilenumber
            [co_fax] =>
            [co_email] => email@example.com
            [co_usms] => 1
            [co_ufax] => 0
            [co_uemail] => 1
            [a_id] => 3
        )
)
I loop over this array and build insert statements.
There are three tables: Message_email, Message_fax, and Message_sms.
If a contact has co_usms, co_ufax, or co_uemail set to 1, I add the contact’s id and the corresponding contact detail (co_mobile, co_fax, or co_email) to the matching array ($mobile/$fax/$sms).
Each array’s contents are then inserted into its table.
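For reference, the grouping step described above might look roughly like the sketch below. The function name groupContactsByChannel and the channel keys (sms, fax, email) are hypothetical and chosen for illustration; only the co_* column names come from the array shown earlier.

```php
<?php
// Hypothetical helper: split contacts into per-channel lists of [co_id, contact detail].
function groupContactsByChannel(array $contacts): array
{
    // flag column => [channel name (assumed), column holding the contact detail]
    $channels = [
        'co_usms'   => ['sms',   'co_mobile'],
        'co_ufax'   => ['fax',   'co_fax'],
        'co_uemail' => ['email', 'co_email'],
    ];
    $grouped = ['sms' => [], 'fax' => [], 'email' => []];

    foreach ($contacts as $contact) {
        foreach ($channels as $flag => [$channel, $column]) {
            // Flags may arrive as strings from the database, so cast before comparing.
            if ((int) ($contact[$flag] ?? 0) === 1) {
                $grouped[$channel][] = [(int) $contact['co_id'], $contact[$column]];
            }
        }
    }
    return $grouped;
}

// Sample contact reduced to the columns the helper reads.
$contacts = [[
    'co_id' => 1, 'co_mobile' => 'mobilenumber', 'co_fax' => '',
    'co_email' => 'email@example.com',
    'co_usms' => 1, 'co_ufax' => 0, 'co_uemail' => 1,
]];
$grouped = groupContactsByChannel($contacts);
```

With the sample contact, the sms and email lists each receive one entry and the fax list stays empty.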
Question
These arrays can get quite large (think 200k+ contacts).
Should I:
a) Create a single bulk insert statement.
b) Create several smaller bulk insert statements.
c) Do an insert statement for each contact.
Speed is nice to have but not the main concern.
Reliability is what matters most.
Matt
I typically batch based on the length of the SQL query. I build the query up, appending one row at a time until it reaches a convenient length (10,000 characters works well), and then flush it to the server.
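A minimal sketch of that length-based batching, under some assumptions: the helper name batchInserts, the column names passed to it, and the stand-in quoting function are all hypothetical. In real code the quoting callable should be your driver's own escaping (for example [$pdo, 'quote'] with PDO), and the table and column names must come from trusted code, not user input.

```php
<?php
// Hypothetical helper: yields multi-row INSERT statements, each kept within
// $maxLength bytes where possible. $quote must escape a single value.
function batchInserts(string $table, array $columns, iterable $rows, callable $quote, int $maxLength = 10000): Generator
{
    // Table and column names are interpolated as-is, so they must be trusted.
    $prefix = "INSERT INTO $table (" . implode(', ', $columns) . ') VALUES ';
    $sql = $prefix;

    foreach ($rows as $row) {
        $tuple = '(' . implode(', ', array_map($quote, $row)) . ')';

        // Flush the pending statement if adding this tuple would exceed the limit.
        // strlen() counts bytes, which is what matters for packet size.
        if ($sql !== $prefix && strlen($sql) + strlen($tuple) + 2 > $maxLength) {
            yield $sql;
            $sql = $prefix;
        }
        $sql .= ($sql === $prefix ? '' : ', ') . $tuple;
    }

    // Flush whatever is left after the loop.
    if ($sql !== $prefix) {
        yield $sql;
    }
}

// Demo with a stand-in quoter and a tiny limit so the split is visible.
$quote = fn($value) => "'" . addslashes((string) $value) . "'";
$rows = [[1, 'a@example.com'], [2, 'b@example.com'], [3, 'c@example.com']];
$statements = iterator_to_array(
    batchInserts('Message_email', ['co_id', 'co_email'], $rows, $quote, 100),
    false
);
```

In practice you would loop over the generator and send each statement as it is produced (for instance with $pdo->exec($sql)), so only one batch is held in memory at a time. A single row longer than the limit is still sent on its own instead of being dropped.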
It’s nice because it compensates for both small inserts (int, int) and huge ones (int, text, text). Plus, you don’t risk the query taking too long and timing out your connection (or blocking other users for too long). With my data sets, 10k characters seems like a sweet spot. Since MAX_PACKET (max_allowed_packet in MySQL) is usually 1 MB, you could go as high as 750k or so and still have plenty of room for error (depending on your dataset). But I’d rather use a bunch of smaller batches than one huge one.