In Apache’s mod_expires module, the Expires* directives accept two base times: access and modification.
ExpiresByType text/html 'access plus 30 days'
understandably means that the cache will request fresh content after 30 days.
However,
ExpiresByType text/html 'modification plus 2 hours'
doesn’t make intuitive sense.
How does the browser cache know that the file has been modified unless it makes a request to the server? And if it is making a call to the server, what is the use of this caching directive? It seems to me that I am not understanding some crucial part of caching. Please enlighten me.
An Expires* directive with ‘modification’ as its base refers to the modification time of the file on the server. So if you set, say, ‘modification plus 2 hours’, any browser that requests content within 2 hours after the file is modified (on the server) will cache that content until 2 hours after the file’s modification time. And the browser knows when that time is because the server sends an Expires header with the proper expiration time.
Let me explain with an example: say your Apache configuration includes an ExpiresDefault directive set to ‘modification plus 2 hours’, and you have a file index.html, which the ExpiresDefault directive applies to, on the server. Suppose you upload a version of index.html at 9:53 GMT, overwriting the previous existing index.html (if there was one). So now the modification time of index.html is 9:53 GMT. If you were running ls -l on the server (or dir on Windows), you would see that time in the listing.
Now, with every request, Apache sends the Last-Modified header with the last modification time of the file. Since you have that ExpiresDefault directive, it will also send the Expires header with a time equal to the modification time of the file (9:53) plus two hours. So the browser sees a Last-Modified header with a time of 9:53 GMT and an Expires header with a time of 11:53 GMT.
If the time at which the browser makes this request is before 11:53 GMT, the browser will cache the page, because it has not yet expired. So if the user first visits the page at 11:00 GMT, and then goes to the same page again at 11:30 GMT, the browser will see that its cached version is still valid and will not (or rather, is allowed not to) make a new HTTP request.
If the user goes to the page a third time at 12:00 GMT, the browser sees that its cached version has now expired (it’s after 11:53), so it attempts to validate the page, sending a request to the server with an If-Modified-Since header. A 304 (Not Modified) response with no body will be returned, since the page’s modification date has not changed since it was first served. Since the expiry date has passed (the page is ‘stale’), a validation request will be made every subsequent time the page is visited until validation fails.
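The arithmetic in this walkthrough can be checked with a small Python sketch. It is only an illustration of the freshness rule, not how Apache or any browser is implemented; the calendar date and the helper names (expires_from_modification, is_fresh) are assumptions made up for the example, since only times of day are given above.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical date: the walkthrough only gives times of day.
DAY = dict(year=2024, month=1, day=1, tzinfo=timezone.utc)


def expires_from_modification(mtime, offset):
    """Model 'modification plus <offset>': expiry is the file's mtime plus the offset."""
    return mtime + offset


def is_fresh(now, expires):
    """A cached copy may be reused without contacting the server until it expires."""
    return now < expires


last_modified = datetime(hour=9, minute=53, **DAY)
expires = expires_from_modification(last_modified, timedelta(hours=2))

# The three visits from the walkthrough: 11:00, 11:30 and 12:00 GMT.
visits = [datetime(hour=h, minute=m, **DAY) for h, m in [(11, 0), (11, 30), (12, 0)]]
results = [
    (v.strftime("%H:%M"), "use cache" if is_fresh(v, expires) else "revalidate")
    for v in visits
]

for when, action in results:
    print(when, action)
```

The first two visits fall before the 11:53 expiry and can be served from the cache; the 12:00 visit is past it and triggers validation.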
Now, let’s pretend instead that you uploaded a new version of the page at 11:57. In this case, the browser’s attempt to validate the old version of the page at 12:00 fails and it receives in the response, along with the new page, new values for the two headers: a Last-Modified time of 11:57 GMT and an Expires time of 13:57 GMT.
(The last modification time of the file becomes 11:57 upon upload of the new version, and Apache calculates the expiration time as 11:57 + 2:00 = 13:57 GMT.)
Validation (using the more recent date) will not be required now until 13:57.
(Note, of course, that many other things are sent along with the two headers I listed above; I just trimmed out all the rest for simplicity.)
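To make the two validation outcomes concrete, here is a rough Python model of the server’s side of the exchange. It is a simplified sketch, not Apache’s actual logic: the respond function is hypothetical, the date is assumed, and a real response carries many more headers.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

TWO_HOURS = timedelta(hours=2)  # models 'modification plus 2 hours'


def respond(mtime, if_modified_since=None):
    """Return (status, headers) for a request, given the file's current mtime.

    Hypothetical helper: 304 if the file is unchanged since the client's
    copy, otherwise 200 (the body is omitted in this sketch).
    """
    headers = {
        "Last-Modified": format_datetime(mtime, usegmt=True),
        "Expires": format_datetime(mtime + TWO_HOURS, usegmt=True),
    }
    if if_modified_since is not None and mtime <= if_modified_since:
        return 304, headers
    return 200, headers


# Assumed date; only the times of day come from the example.
cached_mtime = datetime(2024, 1, 1, 9, 53, tzinfo=timezone.utc)
new_mtime = datetime(2024, 1, 1, 11, 57, tzinfo=timezone.utc)

# File untouched since 9:53: validation succeeds, the cached copy stays in use.
unchanged_status, unchanged_headers = respond(cached_mtime, if_modified_since=cached_mtime)

# New version uploaded at 11:57: validation fails, the new page and headers are sent.
updated_status, updated_headers = respond(new_mtime, if_modified_since=cached_mtime)

print(unchanged_status, unchanged_headers["Expires"])
print(updated_status, updated_headers["Expires"])
```

In the unchanged case the Expires time is still 11:53, which is why the stale page keeps being revalidated on every visit; once the file changes, the expiry moves to 13:57.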