I’ll give you the gist.
I’m trying to scrape certain URLs using a third-party HTML tag stripper, because I don’t think the default strip_tags() does the job well. (I don’t think you need to look at that stripper.)
Sometimes the HTML source of a site contains odd markup that causes my HTML tag stripper to fail.
One example is a site that contains the following piece of code:
<li><a href="<//?=$cnf['website']?>girls/models-photo-gallery/?sType=6#top_menu">Photo Galleries</a></li>
which causes the tag stripper to throw this error:
Parse error: syntax error, unexpected T_ENCAPSED_AND_WHITESPACE, expecting T_STRING or T_VARIABLE or T_NUM_STRING in /var/www/GET Tweets/htdocs/tmhOAuth-master/examples/class.html2text.inc(429) : regexp code on line 1
Fatal error: preg_replace() [function.preg-replace]: Failed evaluating code: $this->_build_link_list("<//?=$cnf[\'website\']?>girls/models-photo-gallery/?sType=6#top_menu", "Photo Galleries") in /var/www/GET Tweets/htdocs/tmhOAuth-master/examples/class.html2text.inc on line 429
I have an array of many URLs and do some processing on each one; some of them trigger the error above.
If a URL triggers an error like this, I want execution to continue with the next URL without disturbing anything else. My code looks something like this:
foreach ($results as $result)
{
    $url = $result->Url;
    $worddict2 = myfunc($url, $worddict2, $history, $n_gram);
}
Here myfunc does the processing and uses the 3rd party HTML stripper I mentioned before.
I tried modifying the code to this:
foreach ($results as $result)
{
    $url = $result->Url;
    $worddicttemp = array();
    try
    {
        $worddicttemp = myfunc($url, $worddict2, $history, $n_gram); // returns the string representation of what matters, hopefully
        // The line below is executed only when the call above doesn't throw a fatal error
        $worddict2 = $worddicttemp;
    }
    catch (Exception $e)
    {
        continue;
    }
}
But I’m still getting the same error.
What is wrong? Why is the code inside myfunc() not transferring control to the catch block as soon as it encounters that fatal error?
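You can’t catch parse errors (or any fatal errors, for that matter) with try/catch in PHP 5: they are not exceptions, so control never reaches the catch block. Parse errors in your own files are even worse, since they are generated as soon as the code is loaded. Here the parse error comes from code that preg_replace() evaluates at run time (the "Failed evaluating code" message), and the failed evaluation then becomes a fatal error that ends the script. (From PHP 7 onward many fatal errors are thrown as Error objects that catch (Throwable $e) can handle, but that does not help on PHP 5.) The best way I know of to isolate them is to run a completely independent PHP process for each piece of work that you expect might generate a fatal error and want to recover from.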
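As a rough sketch of that separate-process idea (not a drop-in fix), the loop could hand each URL to a child PHP process and skip the URL if the child dies. Everything named here is an assumption for illustration: run_isolated() is a hypothetical helper, the worker script is a hypothetical file that would call myfunc() and echo its result as JSON, and the php CLI binary is assumed to be on the PATH with exec() enabled. Passing $worddict2, $history and $n_gram to the worker (for example through a temporary file) is left out.

```php
<?php
// Hypothetical helper: run $workerScript in its own PHP CLI process with
// $url as the only argument. Returns the decoded JSON array printed by the
// worker, or null if the child exited with a non-zero status (which is what
// happens on a fatal or parse error) or printed something that is not a
// JSON array.
function run_isolated($workerScript, $url, $phpBinary = 'php')
{
    $cmd = escapeshellarg($phpBinary) . ' '
         . escapeshellarg($workerScript) . ' '
         . escapeshellarg($url);

    $output = array();
    $exitCode = 0;
    exec($cmd, $output, $exitCode);

    if ($exitCode !== 0) {
        return null; // the child crashed; the parent keeps running
    }

    $decoded = json_decode(implode("\n", $output), true);
    return is_array($decoded) ? $decoded : null;
}

// Sketch of the calling loop (worker.php is an assumed file name):
//
// foreach ($results as $result) {
//     $worddicttemp = run_isolated('worker.php', $result->Url);
//     if ($worddicttemp === null) {
//         continue; // this URL broke the stripper; move on to the next one
//     }
//     $worddict2 = $worddicttemp;
// }
```

The point of the design is that a fatal error only kills the child process, so the parent sees nothing worse than a failed call. The cost is one process start per URL, which may matter for long lists.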
See also: "How do I catch a PHP Fatal Error"