I have hundreds of thousands of elements to insert into a database. I realized that calling an INSERT statement per element is far too costly, and I need to reduce the overhead.
I reckon each INSERT can specify multiple rows, such as:
INSERT INTO example (Parent, DataNameID) VALUES (1,1), (1,2)
My issue is that the “DataName” keeps repeating for each element, so I thought it would save space to store these string names in another table and reference them by ID.
However, that causes a problem for the bulk insert, which now needs a way to resolve the ID from the name before it runs.
Any recommendations?
Should I simply denormalize and insert the name as a plain string into the table every time?
Also, what is the size limit for a query string? Mine amounts to almost 1.2 MB.
I am using PHP with a MySQL backend.
You haven’t given us a lot of info on the database structure or size, but this may be a case where absolute normalization isn’t worth the hassle.
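If you do go the denormalized route, here is a rough sketch of how the bulk insert could be chunked so that no single statement gets too large. The server's max_allowed_packet setting caps the size of one query, and a prepared statement in MySQL can carry at most 65,535 placeholders, so splitting the rows into batches keeps you under both. The helper name buildBulkInsert, the DataName string column, and the batch size of 1000 are my own assumptions, not something from your schema.

```php
<?php
// Hypothetical helper (not a library function): builds one multi-row
// INSERT with positional placeholders for a batch of rows.
// $table and $columns must be trusted identifiers, never user input.
function buildBulkInsert(string $table, array $columns, array $rows): array
{
    // One "(?, ?)" group per row, sized to the number of columns.
    $group = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';

    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $group))
    );

    // Flatten the rows into a single positional parameter list.
    $params = [];
    foreach ($rows as $row) {
        foreach ($row as $value) {
            $params[] = $value;
        }
    }

    return [$sql, $params];
}

// Usage sketch, assuming an existing PDO connection in $pdo and the
// rows in $allRows as [[parent, name], ...]:
//
// foreach (array_chunk($allRows, 1000) as $chunk) {
//     [$sql, $params] = buildBulkInsert('example', ['Parent', 'DataName'], $chunk);
//     $pdo->prepare($sql)->execute($params);
// }
```

Wrapping the whole loop in a single transaction usually cuts the overhead further, though how much depends on your storage engine and settings.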
However, if you want to keep it normalized and the strings are already in your other table (let’s call it datanames), you can do something like