Could you please help me find what causes this process to reach 500MB of memory usage?
It is basically an HTML page downloader.
Although the process is stable (it does not exceed that limit), it is meant to run on low-performance machines, so I'm not satisfied.
The size of the MySQL table 'Sites' is 170MB.
The script code follows.
Thanks in advance.
function start() {
    try {
        global $log;
        $db = getConnection();
        Zend_Db_Table::setDefaultAdapter($db);
        $log->logInfo("logger start");
        while (1) {
            $sitesTable = new Zend_Db_Table('Sites');
            $rowset = $sitesTable->fetchAll();
            foreach ($rowset as $row) {
                if (time() >= (strtotime($row->lastUpdate) + $row->pollingHours * 60 * 60)) {
                    db_updateHtml($row);
                }
            }
        }
    } catch (Exception $e) {
        global $log;
        $log->logError($e->getMessage());
    }
}
function db_updateHtml($siteRecord) {
    try {
        if ($siteRecord instanceof Zend_Db_Table_Row) {
            $rowwithConnection = $siteRecord;
            $url = $siteRecord->url;
            $idSite = $siteRecord->idSite;
            $crawler = new Crawler();
            $sitesTable = new Zend_Db_Table('Sites');
            //$rowwithConnection = $sitesTable->fetchRow(
            //    $sitesTable->select()->where('idSite = ?', $idSite));
            $newHtml = HtmlDbEncode($crawler->get_web_page($url));
            if (strlen($newHtml) < 10) {
                global $log;
                $log->logError("Download failed for: url: $url \t idsite: $idSite ");
            }
            if ($rowwithConnection->isChecked != 0) {
                $rowwithConnection->oldHtml = $rowwithConnection->newHtml;
                $rowwithConnection->isChecked = 0;
            }
            $rowwithConnection->newHtml = $crawler->get_web_page($url);
            $rowwithConnection->lastUpdate = date("Y-m-d H:i:s");
            //$rowwithConnection->diffHtml = getDiff($rowwithConnection->oldHtml, $rowwithConnection->newHtml, false, $rowwithConnection->minLengthChange);
            $rowwithConnection->diffHtml = getDiffFromRecord($rowwithConnection, false, $rowwithConnection->minLengthChange);
            /* if (strlen($rowwithConnection->diffHtml) > 30) {
                $rowwithConnection->lastChanged = $rowwithConnection->lastUpdate;
            } */
            $rowwithConnection->save();
        } else {
            // $log must be imported here too, otherwise it is undefined in this branch
            global $log;
            $log->logCrit("siteRecord is uninitialized");
        }
    } catch (Exception $e) {
        global $log;
        $log->logError($e->getMessage());
    }
}
function getDiffFromRecord($row, $force = false, $minLengthChange = 100) {
    if ($row instanceof Zend_Db_Table_Row) {
        require_once '/var/www/diff/library/finediff.php';
        include_once '/var/www/diff/library/Text/Diff.php';
        $diff = new AndreaDiff();
        $differences = $diff->getDiff($row->oldHtml, $row->newHtml);
        if ($diff->isChanged($minLengthChange) || $force) {
            $row->lastChanged = $row->lastUpdate;
            $row->isChecked = false;
            return ($differences);
        }
    }
    return null;
}
function getConnection() {
    try {
        $pdoParams = array(
            PDO::MYSQL_ATTR_USE_BUFFERED_QUERY => true
        );
        $db = new Zend_Db_Adapter_Pdo_Mysql(array(
            'host' => '127.0.0.1',
            'username' => 'root',
            'password' => 'administrator',
            'dbname' => 'diff',
            'driver_options' => $pdoParams
        ));
        return $db;
    } catch (Exception $e) {
        global $log;
        $log->logError($e->getMessage());
    }
}
1) Try using the fetch method instead of fetchAll, so that the whole table is not loaded into memory at once.
2) Try to unset all variables that store HTML code (if you keep it in memory). I suppose that after the last iteration the variable $rowwithConnection will still have the HTML code inside.
When I want to profile a PHP application I use xhprof; it will save you a LOT of time. Good luck!
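To make points 1 and 2 concrete, here is a minimal sketch of the idea, not a drop-in replacement for the script above. It assumes plain PDO with an in-memory SQLite table standing in for the MySQL 'Sites' table (so it can run without Zend or MySQL); the helper processSites() and the sample rows are made up for illustration.

```php
<?php
// Assumption: an in-memory SQLite table stands in for the MySQL 'Sites' table.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE Sites (idSite INTEGER PRIMARY KEY, url TEXT, newHtml TEXT)');

// Made-up sample rows: three pages of 1000, 2000 and 3000 bytes.
$insert = $pdo->prepare('INSERT INTO Sites (url, newHtml) VALUES (?, ?)');
for ($i = 1; $i <= 3; $i++) {
    $insert->execute(array("http://example.com/$i", str_repeat('x', 1000 * $i)));
}

/**
 * Hypothetical helper: visits every site one row at a time and returns
 * array(number of rows, total HTML bytes seen).
 */
function processSites(PDO $pdo)
{
    $rows = 0;
    $bytes = 0;
    $stmt = $pdo->query('SELECT idSite, url, newHtml FROM Sites');

    // fetch() hands back a single row per call and false at the end,
    // so the loop never builds an array of all rows the way fetchAll() does.
    while (($row = $stmt->fetch(PDO::FETCH_ASSOC)) !== false) {
        $html = $row['newHtml'];
        $bytes += strlen($html);   // placeholder for the real download/diff work
        $rows++;

        // Release the HTML explicitly before fetching the next row.
        unset($html, $row);
    }
    $stmt->closeCursor();

    return array($rows, $bytes);
}

list($rows, $bytes) = processSites($pdo);
echo "rows: $rows, bytes: $bytes, peak: " . memory_get_peak_usage(true) . " bytes\n";
```

One caveat when carrying this over to MySQL: the connection in the question sets PDO::MYSQL_ATTR_USE_BUFFERED_QUERY to true, which makes the driver copy the complete result set to the client before the first fetch(). Row-by-row fetching may therefore only pay off with that option set to false, and in that mode no other query can run on the same connection until the result has been read completely, so the save() calls would probably need a second connection.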