I am trying to make a cURL request (either directly from the shell or via PHP) that returns essentially the same content for a URL as a request made through a browser (minus any cookies, logins, etc.).
A basic cURL request for http://www.google.com will return what appears to be the Japanese version of Google Search with some character encoding issues.
Adding options such as a standard User-Agent and follow-location still does not produce a response similar to the one my browser receives, which is what I had assumed would happen. Is there a set of flags I should be using to closely imitate a browser request?
The code below is what I am currently using for testing, but even with cookies being stored, Google assumes the location is Japan (google.co.jp).
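// URL under test (the Google home page mentioned above)
$request = 'http://www.google.com';

// Extra request headers sent with the request
$header = array(
    "Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language: en-us,en;q=0.5",
    "Connection: keep-alive",
    "Cache-Control: no-cache",
    "Content-Type: application/x-www-form-urlencoded; charset=UTF-8",
    "Pragma: no-cache",
);
$useragent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)';

$ch = curl_init();
curl_setopt($ch, CURLOPT_VERBOSE, false);
curl_setopt($ch, CURLOPT_URL, $request);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);        // apply the User-Agent defined above
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);         // return the body instead of printing it
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);         // follow redirects
curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
curl_setopt($ch, CURLOPT_HEADER, false);                // do not include response headers in $data
curl_setopt($ch, CURLOPT_COOKIEJAR, "my_cookies.txt");  // write cookies here on close
curl_setopt($ch, CURLOPT_COOKIEFILE, "my_cookies.txt"); // read cookies from here
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);        // skips certificate checks; testing only
curl_setopt($ch, CURLOPT_AUTOREFERER, true);            // set Referer automatically on redirects
curl_setopt($ch, CURLOPT_TIMEOUT, 10);                  // give up after 10 seconds

$data = curl_exec($ch);
if ($data === false) {
    // Report transport-level failures instead of silently returning false
    echo 'cURL error: ' . curl_error($ch);
}
curl_close($ch);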
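For the shell side of the question, the same options can be written as a command-line invocation. This is only a sketch that mirrors the PHP settings above option by option, so it is not expected to behave any differently with respect to google.co.jp. The function name `fetch_like_browser` is a hypothetical helper introduced for illustration, `--silent` has no direct counterpart in the PHP code, and `--insecure` is only an approximation of `CURLOPT_SSL_VERIFYPEER = false`.

```shell
# Sketch: shell counterpart of the PHP test above (helper name is hypothetical).
#   --location        ~ CURLOPT_FOLLOWLOCATION
#   --referer ';auto' ~ CURLOPT_AUTOREFERER
#   --max-time 10     ~ CURLOPT_TIMEOUT
#   --insecure        ~ CURLOPT_SSL_VERIFYPEER = false (testing only)
#   --cookie          ~ CURLOPT_COOKIEFILE
#   --cookie-jar      ~ CURLOPT_COOKIEJAR
#   --user-agent      ~ CURLOPT_USERAGENT
#   --header          ~ one entry of CURLOPT_HTTPHEADER
fetch_like_browser() {
  curl --silent \
       --location \
       --referer ';auto' \
       --max-time 10 \
       --insecure \
       --cookie my_cookies.txt \
       --cookie-jar my_cookies.txt \
       --user-agent 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)' \
       --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' \
       --header 'Accept-Language: en-us,en;q=0.5' \
       --header 'Connection: keep-alive' \
       --header 'Cache-Control: no-cache' \
       --header 'Content-Type: application/x-www-form-urlencoded; charset=UTF-8' \
       --header 'Pragma: no-cache' \
       "$1"
}
```

Calling `fetch_like_browser 'http://www.google.com' > google.html` would write the response body to a file, which makes it easier to compare against the page saved from a browser.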