I'm stuck on this particular issue. I have code that fetches a lot of pages from a forum, e.g. http://www.q8yat.net, using a loop that requests each page with curl. Everything works fine on my localhost, but when I upload the files to my server and run the script there, I get a connection timeout error. It usually happens after a fixed number of pages have loaded, but not always. The curl options I am using are:
$options = array(
    CURLOPT_RETURNTRANSFER => true,     // return web page
    CURLOPT_HEADER         => false,    // don't return headers
    CURLOPT_FOLLOWLOCATION => true,     // follow redirects
    CURLOPT_ENCODING       => "",       // handle all encodings
    CURLOPT_USERAGENT      => "spider", // who am i
    CURLOPT_AUTOREFERER    => true,     // set referer on redirect
    CURLOPT_CONNECTTIMEOUT => 1,        // timeout on connect (seconds)
    CURLOPT_TIMEOUT        => 1200,     // timeout on response (seconds)
    CURLOPT_MAXREDIRS      => 10,       // stop after 10 redirects
);
phpinfo of my server: http://topics4today.com/public/02_12_2010/fcrawl/src/phpinfo.php
OK, I believe the forum may be using an Apache module, mod_bwlimited, to limit the amount of data I can request based on my IP. That is one possibility.
OK, the forum is using an Apache module, mod_bwlimited, to limit the amount of data I can request based on my IP. The issue can be solved by requesting only a limited number of pages in each run of the script. For example, the script runs, requests 2 pages, and stops; it is then started again (using a JavaScript timer) and requests 2 more pages, and this goes on in a loop.
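
A minimal sketch of the batching idea described above, assuming PHP with the curl extension. The helper names next_batch and fetch_batch, the default batch size of 2, and the shape of the returned arrays are hypothetical and not from the original script; $options is meant to be the options array shown in the question.

```php
<?php
// Hypothetical helper: pick the URLs for this run and work out where
// the next run should start. Returns 'next' => null when nothing is left.
function next_batch(array $urls, int $offset, int $batchSize = 2): array
{
    $batch = array_slice($urls, $offset, $batchSize);
    $next  = $offset + count($batch);

    return [
        'batch' => $batch,
        'next'  => $next < count($urls) ? $next : null,
    ];
}

// Hypothetical helper: fetch every URL in one batch with the given
// cURL options and record the cURL error number for each request.
function fetch_batch(array $batch, array $options): array
{
    $results = [];
    foreach ($batch as $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, $options);
        $body = curl_exec($ch); // page body, or false on failure

        // curl_errno() is 0 on success; 28 means the operation timed out.
        $results[$url] = ['body' => $body, 'errno' => curl_errno($ch)];
    }

    return $results;
}
```

One way to use this: each run of the script reads the current offset from the request, calls next_batch and then fetch_batch on the result, and returns the 'next' value to the page. The JavaScript timer then waits and calls the script again with that offset until 'next' is null.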