I want to fetch a URL with curl, but only conditionally, based on its Expires header attribute.
I want to retrieve the page only if it was cached in the last 30 days.
Two things that I think aren't right:
1) The gmmktime(0, 0, 0, 1, 1, 1998) call. I am not sure how to set it to 30 days before today.
2) Whether it will return Google's page based on its headers. What will the $page variable be if the URL has no cached headers with a date older than 30 days?
function exractURl()
{
// How do I make gmmktime give the date 30 days before today?
$ts = gmdate("D, d M Y H:i:s", gmmktime(0, 0, 0, 1, 1, 1998)) . " GMT";
$c= curl_init('http://www.google.co.il/');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_HTTPHEADER, array('Expires:'.$ts));
// What will $page contain if the headers aren't found?
$page= curl_exec($c);
curl_close($c);
}
UPDATE:
function exractURl()
{
$ts = gmdate("D, d M Y H:i:s", strtotime("30 days ago")) . " GMT";
$c= curl_init('http://www.google.co.il/');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
curl_setopt($c, CURLOPT_HTTPHEADER, array('If-Modified-Since:'.$ts));
$page= curl_exec($c);
curl_close($c);
return $page;
}
You can use the If-Modified-Since header to ask the server to only return the content if it has changed (otherwise you'll get a 304 Not Modified response). Of course this relies on the server behaving. See here for more details: http://www.mnot.net/cache_docs/

And to answer your question on how to get the time as of 30 days ago, you can use the ever-convenient strtotime, for example strtotime("30 days ago").
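As a rough sketch of how the two pieces might fit together: the code below builds the HTTP date with strtotime and gmdate, sends it as If-Modified-Since, and then checks the status code so the caller can tell a 304 apart from a real page. The function names httpDateDaysAgo and fetchIfModifiedSince are made up for this example, and returning null on a 304 or a failed request is just one possible convention, not something curl does for you.

```php
<?php
// Hypothetical helper: format "$days days before $now" as an HTTP date.
// $now defaults to the current time; it is a parameter only to make testing easier.
function httpDateDaysAgo(int $days, ?int $now = null): string
{
    $now = $now ?? time();
    return gmdate('D, d M Y H:i:s', strtotime("-{$days} days", $now)) . ' GMT';
}

// Hypothetical helper: return the page body, or null when the server
// answers 304 Not Modified or the request itself fails.
function fetchIfModifiedSince(string $url, int $days): ?string
{
    $c = curl_init($url);
    curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($c, CURLOPT_HTTPHEADER, array('If-Modified-Since: ' . httpDateDaysAgo($days)));

    $page = curl_exec($c);                           // false on a transfer error
    $status = curl_getinfo($c, CURLINFO_HTTP_CODE);  // e.g. 200 or 304
    curl_close($c);

    if ($page === false || $status === 304) {
        return null;
    }
    return $page;
}
```

Calling fetchIfModifiedSince('http://www.google.co.il/', 30) would then give you either the page or null. Keep in mind that a server is free to ignore If-Modified-Since and send the full page with a 200 anyway, so a non-null result does not prove the page changed in the last 30 days.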