I'm trying to crawl a secure (HTTPS) page, such as Google, with cURL,
but I get no data back from my crawler.
PHP function:
function getDOM($url){
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_RANGE, '0-100');
    $content = curl_exec($ch);
    curl_close($ch);
    echo $url."<br>";
    echo $content;
    $dom = new simple_html_dom();
    $dom->load($content);
    if($dom){
        return $dom;
    }
    return null;
}
getDOM("https://www.google.co.uk/search?sugexp=chrome,mod=14&sourceid=chrome&ie=UTF-8&q=crawling%20https#hl=en&gs_nf=1&pq=site:stackoverflow.com%20crawling%20https%20php&cp=6&gs_id=s&xhr=t&q=stackoverflow&pf=p&sclient=psy-ab&oq=stacko&aq=0&aqi=g4&aql=&gs_l=&pbx=1&bav=on.2,or.r_gc.r_pw.r_qf.,cf.osb&fp=8baefeb740f734a5&biw=1280&bih=685");
Is there anything I can do to crawl an HTTPS page? I don't seem to have this problem with normal (HTTP) pages.
Add this to your code. This will allow any certificate to pass through, so it should be fine for your use (but not a good idea in general).
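The snippet the answer refers to is not shown here. Given the description ("allow any certificate to pass through"), it is most likely cURL's `CURLOPT_SSL_VERIFYPEER` option set to `false`; treat that as an assumption. The sketch below shows where such a line would go. The helper names `insecureHttpsOptions` and `fetchPage` are made up for illustration and are not part of the original post.

```php
<?php
// Hypothetical helper: the cURL options used for the request.
function insecureHttpsOptions() {
    return array(
        CURLOPT_RETURNTRANSFER => true,   // return the body instead of printing it
        // Assumed to be the line the answer means: skip verification of the
        // server's certificate. Acceptable for a quick crawl, unsafe in general.
        CURLOPT_SSL_VERIFYPEER => false,
    );
}

// Hypothetical helper: fetch a page and return its body, or null on failure.
function fetchPage($url) {
    $ch = curl_init($url);
    curl_setopt_array($ch, insecureHttpsOptions());
    $content = curl_exec($ch);
    if ($content === false) {
        // curl_error() describes why the transfer failed (e.g. an SSL problem)
        error_log(curl_error($ch));
        $content = null;
    }
    curl_close($ch);
    return $content;
}
```

In the question's `getDOM()`, the equivalent change would be a single extra `curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);` call before `curl_exec($ch)`. Checking the result of `curl_exec()` against `false` and logging `curl_error($ch)` is a quick way to confirm whether certificate verification was really the cause.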