I’m facing an issue with query / data integrity.
I’m reading loads of data (many gigabytes) from a MySQL database, processing them using PHP, to re-insert them into the same database in completely different tables. It’s a one-run conversion from an old data structure to a new one, where tons of values need on the fly modification.
SELECT FROM x -> modify in PHP -> INSERT INTO y is way too slow. That takes around 1 second per 1000 records, and we’re talking many many millions here. That’s why I decided to export every INSERT into a text/sql file.
But, what’s the safest way to go here? The data is everything from booleans to BLOB’s. I can’t really be sure of what’s in it and what’s not, but everything needs to get back into place appropriately, including quotes, slashes, etc.
How can I escape without any data loss?
You can use prepared statements to avoid the need to escape the data.
The queries you create for the prepared statements can be concatenated and executed as a group, instead of having one execute call for each row.
The code that creates, groups & concatenates SQL statements together should allow you to easily choose the number of statements concatenated in a group so you can experiment with 100 statements in one exec call vs. 1000 statements in one exec call.
The results should be much better.