This code works for most websites, such as Google, YouTube, and Facebook, but it doesn't work for some, such as Technorati:
<?php
$favicon="http://technorati.com/favicon.ico";
$content = file_get_contents($favicon);
file_put_contents('favicon/icon.ico', $content);
echo "<img src=\"http://localhost/test/favicon/icon.ico\" />";
?>
Output:
Warning: file_get_contents(http://technorati.com/favicon.ico)
[function.file-get-contents]: failed to open stream: HTTP request
failed! HTTP/1.1 403 Forbidden in /opt/lampp/htdocs/test/simple.php on
line 3
How can I download Technorati's favicon?
Take a look at what happens when you issue the request, using Fiddler or Wireshark for example.
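My guess is that the Technorati web server is configured to deny automated requests, which it probably detects from the User-Agent header your script sends (by default, file_get_contents sends none unless the user_agent ini setting is configured).
With cURL you can set the user agent through the CURLOPT_USERAGENT option.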
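A minimal sketch of that cURL approach, assuming the 403 really is caused by the missing or unrecognized user agent (which I haven't verified against Technorati). The helper names `favicon_curl_options` and `fetch_favicon` and the user agent string in the usage note are made up for illustration; the scalar type hints assume PHP 7 or later.

```php
<?php
// Hypothetical helper: cURL options for a request that identifies
// itself with the given User-Agent string.
function favicon_curl_options(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true,       // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,       // follow redirects, if any
        CURLOPT_USERAGENT      => $userAgent, // the header the server is assumed to check
        CURLOPT_TIMEOUT        => 10,         // give up after 10 seconds
    ];
}

// Hypothetical helper: returns the response body on HTTP 200, false otherwise.
function fetch_favicon(string $url, string $userAgent)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, favicon_curl_options($userAgent));
    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return ($body !== false && $status === 200) ? $body : false;
}
```

You would then call something like `fetch_favicon('http://technorati.com/favicon.ico', 'Mozilla/5.0 (X11; Linux x86_64)')` and pass the result to `file_put_contents('favicon/icon.ico', ...)` only when it is not `false`. If the server still answers 403, it is probably checking something other than the user agent, and the captured request from Fiddler or Wireshark should show what differs from a browser request.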