Right, this code loops through a rather large multidimensional array (about 28,000 rows and 16 columns).
Order of events:
- Check if the data exists in the database
- if it exists – Update it with the new data
- if it doesn’t exist – Insert it
Simple.
But right now a single run has taken over 30 minutes, I think, and it is still going.
$counter = 0;
$events = sizeof($feed_array['1'])-1;
while($counter <= $events ) {
$num_rows = mysql_num_rows(mysql_query("SELECT * FROM it_raw WHERE perfID = '".addslashes($feed_array['16'][$counter])."'"));
if($num_rows) {
$eventDate=explode("/", $feed_array['1'][$counter]); //print_r($eventDate);
$eventTime=explode(":", $feed_array['2'][$counter]); //print_r($eventTime);
$eventUnixTime=mktime($eventTime[0], $eventTime[1], "00", $eventDate[1], $eventDate[0], $eventDate[2]);
mysql_query("UPDATE `it_raw` SET
`eventtime` = '".$eventUnixTime."',
`eventname` = '".addslashes($feed_array['3'][$counter])."',
`venuename` = '".addslashes($feed_array['4'][$counter])."',
`venueregion` = '".addslashes($feed_array['5'][$counter])."',
`venuepostcode` = '".addslashes($feed_array['6'][$counter])."',
`country` = '".addslashes($feed_array['7'][$counter])."',
`minprice` = '".addslashes($feed_array['8'][$counter])."',
`available` = '".addslashes($feed_array['9'][$counter])."',
`link` = '".addslashes($feed_array['10'][$counter])."',
`eventtype` = '".addslashes($feed_array['11'][$counter])."',
`seaOnSaleDate` = '".addslashes($feed_array['12'][$counter])."',
`perOnSaleDate` = '".addslashes($feed_array['13'][$counter])."',
`soldOut` = '".addslashes($feed_array['14'][$counter])."',
`eventImageURL` = '".addslashes($feed_array['15'][$counter])."',
`perfID`= '".addslashes($feed_array['16'][$counter])."'
WHERE `perfID` = '".addslashes($feed_array['16'][$counter])."' LIMIT 1 ;");
echo "UPDATE ".$feed_array['16'][$counter].": ".addslashes($feed_array['3'][$counter])."\n";
} else {
$eventDate=explode("/", $feed_array['1'][$counter]); //print_r($eventDate);
$eventTime=explode(":", $feed_array['2'][$counter]); //print_r($eventTime);
$eventUnixTime=mktime($eventTime[0], $eventTime[1], "00", $eventDate[1], $eventDate[0], $eventDate[2]);
$sql = "INSERT INTO `dante_tickets`.`it_raw` (
`id` ,
`eventtime` ,
`eventname` ,
`venuename` ,
`venueregion` ,
`venuepostcode` ,
`country` ,
`minprice` ,
`available` ,
`link` ,
`eventtype` ,
`seaOnSaleDate` ,
`perOnSaleDate` ,
`soldOut` ,
`eventImageURL` ,
`perfID`
)
VALUES (
NULL ,
'".$eventUnixTime."',
'".addslashes($feed_array['3'][$counter])."',
'".addslashes($feed_array['4'][$counter])."',
'".addslashes($feed_array['5'][$counter])."',
'".addslashes($feed_array['6'][$counter])."',
'".addslashes($feed_array['7'][$counter])."',
'".addslashes($feed_array['8'][$counter])."',
'".addslashes($feed_array['9'][$counter])."',
'".addslashes($feed_array['10'][$counter])."',
'".addslashes($feed_array['11'][$counter])."',
'".addslashes($feed_array['12'][$counter])."',
'".addslashes($feed_array['13'][$counter])."',
'".addslashes($feed_array['14'][$counter])."',
'".addslashes($feed_array['15'][$counter])."',
'".addslashes($feed_array['16'][$counter])."'
);";
mysql_query($sql) or die(mysql_error().":".$sql);
echo "Inserted ".$feed_array['16'][$counter].": ".addslashes($feed_array['3'][$counter])."\n";
}
unset($sql);
$counter++;
}
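The check-then-update-or-insert sequence above can sometimes be collapsed into one statement with MySQL's INSERT ... ON DUPLICATE KEY UPDATE. This is only a sketch: it assumes `perfID` has a UNIQUE index (the post does not say so), `buildUpsertSql` is a made-up helper name, and `addslashes` is passed as the escaper only to mirror the code above. Column names are taken from trusted code, not from the feed.

```php
<?php
// Sketch: build one "insert or update" statement for a single feed row.
// ASSUMPTION: it_raw has a UNIQUE index on perfID; without it, MySQL
// would never hit the ON DUPLICATE KEY branch and would always insert.
function buildUpsertSql(array $row, callable $escape)
{
    $columns = array();
    $values  = array();
    $updates = array();
    foreach ($row as $column => $value) {
        $columns[] = "`" . $column . "`";
        $values[]  = "'" . call_user_func($escape, $value) . "'";
        // The key column itself does not need to be rewritten on update.
        if ($column !== 'perfID') {
            $updates[] = "`" . $column . "` = VALUES(`" . $column . "`)";
        }
    }
    return "INSERT INTO `it_raw` (" . implode(", ", $columns) . ")"
         . " VALUES (" . implode(", ", $values) . ")"
         . " ON DUPLICATE KEY UPDATE " . implode(", ", $updates);
}

// Hypothetical row, shortened to two columns for readability.
$row = array('eventname' => "O'Hara Live", 'perfID' => '210968');
echo buildUpsertSql($row, 'addslashes'), "\n";
```

With this approach the SELECT per row disappears, so each feed row costs one query instead of two.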
UPDATE
I just ran EXPLAIN on the lookup query for one of the rows:
mysql> EXPLAIN SELECT * FROM it_raw WHERE perfID = 210968;
+----+-------------+--------+------+---------------+--------+---------+-------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+------+---------------+--------+---------+-------+------+-------+
| 1 | SIMPLE | it_raw | ref | perfID | perfID | 4 | const | 1 | |
+----+-------------+--------+------+---------------+--------+---------+-------+------+-------+
1 row in set (0.07 sec)
UPDATE 2
To try and “speed” things up, instead of carrying out the UPDATE and INSERT statements straight away, I’ve now placed them in a variable (so only the initial SELECT runs, to check for a duplicate, and the action [insert or update] is stored). At the end of the loop it executes all the statements.
Except now it’s coming up with a MySQL error saying the syntax is incorrect (when initially there was nothing wrong).
I’ve simply replaced the mysql_query call with:
$sql_exec .= "SELECT…. ;";
Is there something I’m missing here in the formatting?
UPDATE 3
OK, finally fixed it.
Lessons Learned:
- Do the lookup logic against the database first
- Carry out inserts/updates in bulk.
Here is the final code, which now takes about 60 seconds to run (down from over 30 minutes):
$sql_exec = array(); // queued statements, executed after the loop
while($counter <= $events ) {
$num_rows = mysql_num_rows(mysql_query("SELECT * FROM it_raw WHERE perfID = '".addslashes($feed_array['16'][$counter])."'"));
if($num_rows) {
$eventDate=explode("/", $feed_array['1'][$counter]); //print_r($eventDate);
$eventTime=explode(":", $feed_array['2'][$counter]); //print_r($eventTime);
$eventUnixTime=mktime($eventTime[0], $eventTime[1], "00", $eventDate[1], $eventDate[0], $eventDate[2]);
$sql_exec[] = "UPDATE `it_raw` SET `eventtime` = '".$eventUnixTime."',`eventname` = '".addslashes($feed_array['3'][$counter])."',`venuename` = '".addslashes($feed_array['4'][$counter])."',`venueregion` = '".addslashes($feed_array['5'][$counter])."',`venuepostcode` = '".addslashes($feed_array['6'][$counter])."',`country` = '".addslashes($feed_array['7'][$counter])."',`minprice` = '".addslashes($feed_array['8'][$counter])."',`available` = '".addslashes($feed_array['9'][$counter])."',`link` = '".addslashes($feed_array['10'][$counter])."',`eventtype` = '".addslashes($feed_array['11'][$counter])."',`seaOnSaleDate` = '".addslashes($feed_array['12'][$counter])."',`perOnSaleDate` = '".addslashes($feed_array['13'][$counter])."',`soldOut` = '".addslashes($feed_array['14'][$counter])."',`eventImageURL` = '".addslashes($feed_array['15'][$counter])."',`perfID`='".addslashes($feed_array['16'][$counter])."' WHERE `perfID` = '".addslashes($feed_array['16'][$counter])."' LIMIT 1;";
echo "UPDATE ".$feed_array['16'][$counter].": ".addslashes($feed_array['3'][$counter])."\n";
} else {
$eventDate=explode("/", $feed_array['1'][$counter]); //print_r($eventDate);
$eventTime=explode(":", $feed_array['2'][$counter]); //print_r($eventTime);
$eventUnixTime=mktime($eventTime[0], $eventTime[1], "00", $eventDate[1], $eventDate[0], $eventDate[2]);
$sql_exec[] = "INSERT INTO `it_raw` (`id` ,`eventtime` ,`eventname` ,`venuename` ,`venueregion` ,`venuepostcode` ,`country` ,`minprice` ,`available` ,`link` ,`eventtype` ,`seaOnSaleDate` ,
`perOnSaleDate` ,`soldOut` ,`eventImageURL` ,`perfID`) VALUES ( NULL ,'".$eventUnixTime."','".addslashes($feed_array['3'][$counter])."','".addslashes($feed_array['4'][$counter])."','".addslashes($feed_array['5'][$counter])."','".addslashes($feed_array['6'][$counter])."','".addslashes($feed_array['7'][$counter])."','".addslashes($feed_array['8'][$counter])."','".addslashes($feed_array['9'][$counter])."','".addslashes($feed_array['10'][$counter])."','".addslashes($feed_array['11'][$counter])."','".addslashes($feed_array['12'][$counter])."','".addslashes($feed_array['13'][$counter])."','".addslashes($feed_array['14'][$counter])."','".addslashes($feed_array['15'][$counter])."','".addslashes($feed_array['16'][$counter])."');";
//mysql_query($sql) or die(mysql_error().":".$sql);
echo "Inserted ".$feed_array['16'][$counter].": ".addslashes($feed_array['3'][$counter])."\n";
}
unset($sql);
$counter++;
}
foreach($sql_exec as $value) {
mysql_query($value) or die (mysql_error().": ".$value);
}
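The loop above still sends one SELECT per feed row. One possible refinement, sketched here under assumptions, is to fetch every existing `perfID` once and decide insert versus update in memory. `partitionByExistence` is a hypothetical helper, and `$existingIds` is assumed to come from a single query such as `SELECT perfID FROM it_raw`.

```php
<?php
// Sketch: decide insert vs. update in memory instead of one SELECT per row.
// ASSUMPTION: $existingIds was loaded beforehand with a single query.
function partitionByExistence(array $feedIds, array $existingIds)
{
    // Flip so each perfID becomes a key and isset() is a constant-time lookup.
    $known  = array_flip($existingIds);
    $result = array('update' => array(), 'insert' => array());
    foreach ($feedIds as $index => $perfId) {
        $action = isset($known[$perfId]) ? 'update' : 'insert';
        $result[$action][] = $index; // remember the feed row position
    }
    return $result;
}

// Hypothetical IDs: two already exist in the table, one is new.
$plan = partitionByExistence(
    array('210968', '210969', '300001'),
    array('210968', '300001')
);
print_r($plan); // 'update' holds positions 0 and 2, 'insert' holds position 1
```

Note that a new `perfID` appearing twice in the feed would land in the insert list twice, so duplicates in the feed would need separate handling.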
You could try grouping the inserts and updates so the code runs fewer queries.
For example, you could combine all of the inserts into one very large multi-row INSERT, or send one INSERT for every 100 rows.
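As a rough illustration of the grouping idea, the sketch below turns a list of rows into a handful of multi-row INSERT statements. `buildBatchedInserts` is a made-up name, every row is assumed to have the same keys in the same order, and the escaping callable stands in for whatever escape function matches your connection.

```php
<?php
// Sketch: turn many rows into a few multi-row INSERT statements.
// ASSUMPTIONS: all rows share the same keys; $batchSize is at least 1.
function buildBatchedInserts(array $rows, $batchSize, callable $escape)
{
    $statements = array();
    foreach (array_chunk($rows, $batchSize) as $chunk) {
        $columns = array_keys($chunk[0]);
        $tuples  = array();
        foreach ($chunk as $row) {
            $quoted = array();
            foreach ($columns as $column) {
                $quoted[] = "'" . call_user_func($escape, $row[$column]) . "'";
            }
            $tuples[] = "(" . implode(", ", $quoted) . ")";
        }
        // One statement per chunk: INSERT ... VALUES (...), (...), ...
        $statements[] = "INSERT INTO `it_raw` (`" . implode("`, `", $columns) . "`) VALUES "
                      . implode(", ", $tuples);
    }
    return $statements;
}

// Hypothetical data: 250 rows in batches of 100 gives 3 statements.
$rows = array();
for ($i = 1; $i <= 250; $i++) {
    $rows[] = array('eventname' => "Event $i", 'perfID' => (string) $i);
}
$statements = buildBatchedInserts($rows, 100, 'addslashes');
echo count($statements), " statements\n";
```

Very large batches can run into the server's maximum packet size, which is one reason to cap the batch rather than send everything in a single statement.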
Also, using prepared statements, as gradbot suggested, may help.
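A minimal sketch of the prepared-statement idea, using PDO rather than the mysql_* functions from the question (that swap is my assumption, since the old extension has no prepared statements). `updateEventNames` is a hypothetical helper, only one column is updated to keep it short, and `$pdo` is assumed to be an already-connected handle.

```php
<?php
// Sketch: prepare the UPDATE once, then execute it for every row.
// ASSUMPTION: $pdo is an open PDO connection to the database holding it_raw.
function updateEventNames(PDO $pdo, array $rows)
{
    $stmt = $pdo->prepare(
        "UPDATE it_raw SET eventname = :eventname WHERE perfID = :perfID"
    );
    $affected = 0;
    foreach ($rows as $row) {
        // Values are bound as parameters, so no manual escaping is needed.
        $stmt->execute(array(
            ':eventname' => $row['eventname'],
            ':perfID'    => $row['perfID'],
        ));
        $affected += $stmt->rowCount(); // rows the driver reports as affected
    }
    return $affected;
}
```

The statement is parsed once and reused, and the bound parameters also replace the addslashes calls.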
Other than that, it’s hard to say which part is the major contributor to the slowness. You should use a profiler to determine that, for example on a smaller dataset so the profiled script runs in a reasonable time.
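If a full profiler is not at hand, a crude alternative is to time each phase yourself. The sketch below is only that: `timePhase` is a made-up helper, and the two closures are placeholders standing in for the real lookup and write steps.

```php
<?php
// Sketch: accumulate wall-clock time per labelled phase of the import.
function timePhase(array &$totals, $label, callable $work)
{
    $start   = microtime(true);
    $result  = call_user_func($work);
    $elapsed = microtime(true) - $start;
    $totals[$label] = (isset($totals[$label]) ? $totals[$label] : 0.0) + $elapsed;
    return $result;
}

$totals = array();
// Placeholder work; in the real script these would wrap the SELECT and the writes.
$sum = timePhase($totals, 'lookup', function () { return array_sum(range(1, 1000)); });
timePhase($totals, 'write', function () { usleep(1000); });

arsort($totals); // slowest phase first
foreach ($totals as $label => $seconds) {
    printf("%-8s %.4f s\n", $label, $seconds);
}
```

Running this over a few hundred rows should show whether the lookups or the writes dominate before committing to a larger rewrite.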