This is a two-part question.
Q1: Can a cURL-based request imitate a browser-based request 100%?
Q2: If yes, which options should be set? If not, what extra does the browser do that cannot be imitated by cURL?
I have a website, and I see thousands of requests being made from a single IP in a very short time. These requests harvest all my data. When I looked at the log to identify the agent used, the requests looked like they came from a browser. So I am curious to know whether it is a bot and not a user.
Thanks in advance
R1: I suppose that, if you set all the correct headers, then yes, a curl-based request can imitate a browser-based one: after all, both send an HTTP request, which is just a few lines of text following a specific convention (namely, the HTTP RFC).
R2: The best way to answer that question is to take a look at what your browser is sending; with Firefox, for instance, you can use either Firebug or LiveHTTPHeaders to get that.
For instance, to get this page, Firefox sent these request headers:
(I just removed some of the information, but you get the idea 😉)
Using curl, you can work with curl_setopt to set the HTTP headers; here, you'd probably have to use a combination of CURLOPT_HTTPHEADER, CURLOPT_COOKIE, CURLOPT_USERAGENT, …
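To make that a bit more concrete, here is a minimal sketch of how those options could be combined in PHP. The helper names build_browser_like_options and fetch_like_browser are hypothetical (they are not part of the curl extension), and the header values you pass in should be whatever your own browser actually sends; nothing here is taken from a real capture.

```php
<?php
// Hypothetical helper: turns browser-style header values into an options
// array suitable for curl_setopt_array(). Requires PHP's curl extension.
function build_browser_like_options(string $userAgent, array $headers, string $cookies): array
{
    // CURLOPT_HTTPHEADER expects a list of "Name: value" strings.
    $headerLines = [];
    foreach ($headers as $name => $value) {
        $headerLines[] = $name . ': ' . $value;
    }

    return [
        CURLOPT_RETURNTRANSFER => true,         // return the body instead of printing it
        CURLOPT_USERAGENT      => $userAgent,   // sets the User-Agent header
        CURLOPT_HTTPHEADER     => $headerLines, // Accept, Accept-Language, ...
        CURLOPT_COOKIE         => $cookies,     // "name=value; other=value"
        CURLOPT_ENCODING       => '',           // empty string: advertise every encoding curl supports
    ];
}

// Hypothetical helper: performs the request with the options built above.
// It is only defined here, not called, so loading this file sends nothing.
function fetch_like_browser(string $url, array $options)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, $options);
    return curl_exec($ch); // response body as a string, or false on failure
}
```

You would then call something like fetch_like_browser('http://www.example.com/', build_browser_like_options($ua, $headers, $cookies)), where $ua, $headers and $cookies are placeholders for the values copied from your browser's request.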