Using PHP, how can I accurately test whether a remote website supports the “If-Modified-Since” HTTP header?
From what I have read, if the remote file you GET has been modified since the date specified in the request header, the server should return a 200 OK status. If it hasn’t been modified, it should return a 304 Not Modified.
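My question is therefore: how do I tell a server that doesn’t support “If-Modified-Since” (and so always returns 200 OK) apart from one that supports it but returns 200 OK because the file really has changed?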
There are a few tools out there that check whether your website supports “If-Modified-Since”, so I guess I’m asking how they work.
Edit:
I have performed some testing using cURL, sending the following:
$ch = curl_init();
// If-Modified-Since set 60*60*60*60 seconds (150 days) in the future
curl_setopt($ch, CURLOPT_HTTPHEADER, array("If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',time()+60*60*60*60)));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FORBID_REUSE, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
curl_setopt($ch, CURLOPT_TIMEOUT, 4);
$response = curl_exec($ch); // raw response headers (no body, because of CURLOPT_NOBODY)
That is, with a date in the future (150 days ahead), google.com returns:
HTTP/1.0 304 Not Modified
Date: Fri, 05 Feb 2010 16:11:54 GMT
Server: gws
X-XSS-Protection: 0
X-Cache: MISS from .
Via: 1.0 .:80 (squid)
Connection: close
And if I send:
$ch = curl_init();
// If-Modified-Since set 60*60*60*60 seconds (150 days) in the past
curl_setopt($ch, CURLOPT_HTTPHEADER, array("If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',time()-60*60*60*60)));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FORBID_REUSE, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
curl_setopt($ch, CURLOPT_TIMEOUT, 4);
$response = curl_exec($ch); // raw response headers (no body, because of CURLOPT_NOBODY)
That is, with a date in the past (150 days back), google.com returns:
HTTP/1.0 200 OK
Date: Fri, 05 Feb 2010 16:09:12 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 0
X-Cache: MISS from .
Via: 1.0 .:80 (squid)
Connection: close
If I then send both requests to bbc.co.uk (which doesn’t support it):
The future one returns:
HTTP/1.1 200 OK
Date: Fri, 05 Feb 2010 16:12:51 GMT
Server: Apache
Set-Cookie: BBC-UID=84bb66bc648318e367bdca3ad1d48cf627005b54f090f211a2182074b4ed92c40ForbSoft%20Web%20Diagnostics%20%28URL%20Validator%29; expires=Tue, 04-Feb-14 16:12:51 GMT; path=/; domain=bbc.co.uk;
Accept-Ranges: bytes
Cache-Control: max-age=0
Expires: Fri, 05 Feb 2010 16:12:51 GMT
Pragma: no-cache
Content-Length: 111677
Content-Type: text/html
The date in the past returns:
HTTP/1.1 200 OK
Date: Fri, 05 Feb 2010 16:14:01 GMT
Server: Apache
Set-Cookie: BBC-UID=841b66ec44232cd91e81e88a014a3c5e50ed4e20c0e07174c4ff59675cd2fa210ForbSoft%20Web%20Diagnostics%20%28URL%20Validator%29; expires=Tue, 04-Feb-14 16:14:01 GMT; path=/; domain=bbc.co.uk;
Accept-Ranges: bytes
Cache-Control: max-age=0
Expires: Fri, 05 Feb 2010 16:14:01 GMT
Pragma: no-cache
Content-Length: 111672
Content-Type: text/html
So my question still stands.
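Because the requests above set CURLOPT_FOLLOWLOCATION together with CURLOPT_HEADER, the captured output can contain one header block per redirect hop, so the status line that matters is the last one. A minimal sketch of extracting it, where `finalStatusCode` is a hypothetical helper name of my own and not part of the cURL extension:

```php
<?php
// Hypothetical helper: returns the status code of the LAST header block in
// cURL output captured with CURLOPT_HEADER (one block per redirect hop).
function finalStatusCode(string $rawHeaders): ?int
{
    // Collect every status line, e.g. "HTTP/1.0 304 Not Modified" or "HTTP/2 200".
    if (!preg_match_all('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#m', $rawHeaders, $matches)) {
        return null; // no status line found at all
    }
    // The last status line belongs to the final response after any redirects.
    return (int) end($matches[1]);
}
```

Passing the string returned by curl_exec() to this function should give 304 for the google.com dump above and 200 for both bbc.co.uk dumps.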
I have performed some testing on this and it appears to work as follows:
If you send an If-Modified-Since header with a date that is in the past (5 minutes before the current time should do it), then sites such as google.com, w3.org and mattcutts.com will return an “HTTP/1.1 304 Not Modified” response. Sites such as yahoo.com, bbc.co.uk and stackoverflow.com always return “HTTP/1.1 200 OK”.
The “Last-Modified” header has nothing to do with “If-Modified-Since”, because the whole point of sending back an “HTTP/1.1 304 Not Modified” response is that you don’t have to send the body with it (thus saving bandwidth, which is the whole point behind this).
Therefore, the answer to my question is that if a site doesn’t return an “HTTP/1.1 304 Not Modified” response when you send an “If-Modified-Since” header dated 5 minutes ago, the site doesn’t support the “If-Modified-Since” request properly.
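The check described above could be sketched roughly as follows. The function names `ifModifiedSinceHeader` and `answersNotModified` and the 300-second default are my own assumptions for illustration. Note that a 200 could also mean the page really did change within that window, so treat a single `false` as a hint, not proof.

```php
<?php
// Build an If-Modified-Since header line for a Unix timestamp (HTTP dates are always GMT).
function ifModifiedSinceHeader(int $timestamp): string
{
    return 'If-Modified-Since: ' . gmdate('D, d M Y H:i:s \G\M\T', $timestamp);
}

// Hypothetical check: true if the server answers 304, false for any other
// status, null if the request itself failed (timeout, DNS error, ...).
function answersNotModified(string $url, int $secondsAgo = 300): ?bool
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_HTTPHEADER     => [ifModifiedSinceHeader(time() - $secondsAgo)],
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_NOBODY         => true, // HEAD request, as in the question
        CURLOPT_CONNECTTIMEOUT => 4,
        CURLOPT_TIMEOUT        => 4,
    ]);
    $result = curl_exec($ch);
    if ($result === false) {
        return null; // transport-level failure, nothing can be concluded
    }
    // Status code of the final response after any redirects.
    return curl_getinfo($ch, CURLINFO_HTTP_CODE) === 304;
}
```

For example, `answersNotModified('http://www.w3.org/')` would be expected to return true if the site behaves as described above, but I have not fixed an expected output here since it depends on the live server.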
If I am incorrect, please say so and provide testing to show.
Edit: I forgot to add that a good test is to make a normal HEAD request to the domain (e.g. w3.org), grab the “Last-Modified” date and then make another request with that date in “If-Modified-Since:”. This will test that both the “Last-Modified” value and the “If-Modified-Since” request are supported. Please note: just because the server sends back a “Last-Modified” date doesn’t mean it supports “If-Modified-Since”.
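The two-step test from the edit above might look something like this. It is only a sketch: `extractLastModified` and `roundTripTest` are hypothetical names, and the cURL options simply mirror the ones used in the question.

```php
<?php
// Pull the Last-Modified value out of raw response headers.
// Header names are case-insensitive; with redirects, the last block wins.
function extractLastModified(string $rawHeaders): ?string
{
    if (preg_match_all('/^Last-Modified:[ \t]*(.*?)[ \t]*\r?$/mi', $rawHeaders, $m)) {
        return end($m[1]);
    }
    return null;
}

// Hypothetical round trip: true if echoing Last-Modified back yields a 304,
// false if it yields anything else, null if no Last-Modified was sent.
function roundTripTest(string $url): ?bool
{
    // Small closure that performs one HEAD request and returns [headers, status].
    $fetch = function (array $headers) use ($url): array {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_HTTPHEADER     => $headers,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_HEADER         => true,
            CURLOPT_NOBODY         => true,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_MAXREDIRS      => 5,
            CURLOPT_TIMEOUT        => 4,
        ]);
        $raw = curl_exec($ch);
        return [$raw === false ? '' : $raw, curl_getinfo($ch, CURLINFO_HTTP_CODE)];
    };

    [$raw] = $fetch([]);                      // step 1: plain HEAD request
    $lastModified = extractLastModified($raw);
    if ($lastModified === null) {
        return null;                          // nothing to send back
    }
    [, $status] = $fetch(['If-Modified-Since: ' . $lastModified]); // step 2
    return $status === 304;
}
```

Sending the server's own date back avoids guessing at a cutoff such as 5 minutes, at the cost of one extra request.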