For instance, using this code:
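$curl = curl_init();
curl_setopt_array( $curl, array(
    CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
    CURLOPT_URL => $url,
) );
curl_exec( $curl );
// HTTP status code of the last response; 0 means no HTTP response was received
$header = curl_getinfo( $curl, CURLINFO_HTTP_CODE );
curl_close( $curl );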
$url = "http://upenn.edu" will not work, while $url = "http://www.upenn.edu" will work.
Without the www. the response code I get is 0, whereas with the www. it is 200.
If I were to use PHP get_headers("http://upenn.edu"), I would get two errors:
Warning: get_headers() [function.get-headers]: php_network_getaddresses: getaddrinfo failed: nodename nor servname provided, or not known
and
Warning: get_headers(http://upenn.edu) [function.get-headers]: failed to open stream: php_network_getaddresses: getaddrinfo failed: nodename nor servname provided, or not known
However, when I use the exact same code, http://google.com will work (as well as the expected http://www.google.com.)
Then, for a website such as http://www.dogpile.com, including the www. returns a response code of 0, whereas without the www. I get a 302.
Why is this? And is there a better method to ensure reliable results (i.e., one where the response code is still returned even when the www. is not present)?
I am new to using cURL and dealing with headers and response codes, so any help is appreciated. Thank you.
Your question, although prompted by using curl, is actually independent of curl. Other HTTP client libraries will behave the same with these examples, because the behavior comes from the domain name system and the services running on the remote computer.
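One way to check this without curl is to ask PHP to resolve the host names directly. The following is only a minimal sketch: resolvesToIp() is a hypothetical helper name, and the optional $resolver parameter is an assumption added so the logic can be exercised without network access. It relies on gethostbyname() returning the unmodified host name when the lookup fails, and it is meant for host names rather than literal IP addresses.

```php
<?php
// Hypothetical helper: report whether a host name resolves to an IP address.
// $resolver defaults to PHP's gethostbyname(); it is injectable only so the
// logic can be tried without touching the network.
function resolvesToIp(string $host, ?callable $resolver = null): bool
{
    $resolver = $resolver ?? 'gethostbyname';

    // gethostbyname() returns the unmodified host name when the lookup fails.
    $result = $resolver($host);

    return $result !== $host && filter_var($result, FILTER_VALIDATE_IP) !== false;
}

// Example usage (performs real DNS lookups, so results depend on your network):
// foreach (['upenn.edu', 'www.upenn.edu'] as $host) {
//     echo $host, ': ', resolvesToIp($host) ? 'resolves' : 'does not resolve', PHP_EOL;
// }
```

If a name does not resolve here, curl has no address to connect to either, which is consistent with the response code 0 and the getaddrinfo warnings from get_headers().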
Curl is an HTTP library. When you make an HTTP request, by default it will try to connect to port 80 on a remote computer.
The remote computer is identified by an IP address. That is a number like 173.194.35.134, which you probably know already.
Most often it is not the numbers that are used but domain names, for example google.com for 173.194.35.134.
So telling curl to use the URI http://google.com/ will open a connection to that IP address (on port 80 by default). The domain name system (DNS) will resolve the domain google.com to the IP address.
Domain names can be organized in levels. Each level is separated by a dot (.). The so-called Top Level Domain (TLD) is the rightmost part; for google.com that is com. The Second Level Domain (SLD) is then google. And with www.google.com you have another domain name, this one with three levels. The www is commonly referred to as a subdomain.
The most important part here is that for every different domain name, DNS can return a different IP address, or none at all.
Therefore www.google.com and google.com can be two totally different things. The www subdomain is only a common convention for naming the web server on a network organized under an SLD.TLD.
Since this convention is common, you could try both and see which one works. However, I would not try more than with and without www.
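The "try both" idea could be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: wwwVariants() and firstHttpStatus() are hypothetical helper names, the 10-second connect timeout is an arbitrary choice, and the sketch treats any curl error (curl_errno() other than 0) as "no HTTP response" before moving on to the other variant.

```php
<?php
// Hypothetical helper: return the URL as given, followed by its counterpart
// with the leading "www." added or removed.
function wwwVariants(string $url): array
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return [$url]; // no host found, nothing to vary
    }

    $alt = stripos($host, 'www.') === 0 ? substr($host, 4) : 'www.' . $host;

    // Swap only the first occurrence of the host inside the URL.
    $pos = strpos($url, $host);
    if ($pos === false) {
        return [$url];
    }

    return [$url, substr_replace($url, $alt, $pos, strlen($host))];
}

// Hypothetical helper: try each variant and return the first one that yields
// an HTTP response, or null if curl failed for both (e.g. DNS failure).
function firstHttpStatus(string $url): ?array
{
    foreach (wwwVariants($url) as $candidate) {
        $curl = curl_init();
        curl_setopt_array($curl, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_URL            => $candidate,
            CURLOPT_CONNECTTIMEOUT => 10, // assumed value, adjust as needed
        ]);
        curl_exec($curl);
        $failed = curl_errno($curl) !== 0;
        $code   = curl_getinfo($curl, CURLINFO_HTTP_CODE);

        if (!$failed) {
            return ['url' => $candidate, 'code' => $code];
        }
    }

    return null;
}
```

Calling firstHttpStatus('http://upenn.edu') would first try the URL as given and then http://www.upenn.edu; what it returns depends on DNS and the remote servers at the time of the request.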