In my application, I fetch webpages periodically using LWP. Is there any way to check whether a page has changed between two consecutive fetches, other than explicitly comparing the contents? Is there a signature (say, a CRC) generated at a lower protocol layer that can be extracted and compared against older signatures to detect changes?
There are two possible approaches. One is to compute a digest of the page content, store it, and compare it with the digest computed on the next fetch.
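As a minimal sketch of the digest approach, the snippet below hashes the page bytes with the core Digest::MD5 module and compares the result with a stored value. The helper names page_digest and has_changed are hypothetical, and MD5 is just one possible choice of digest. With LWP you would pass in the raw bytes from $response->content rather than decoded_content, since md5_hex expects bytes and croaks on wide characters.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: hex digest of the raw page bytes.
sub page_digest {
    my ($content) = @_;
    return md5_hex($content);
}

# Hypothetical helper: true if there is no stored digest yet,
# or if the stored digest differs from the digest of the new content.
sub has_changed {
    my ($old_digest, $content) = @_;
    return 1 unless defined $old_digest;
    return $old_digest ne page_digest($content);
}
```

Note that this still downloads the whole page each time; it only replaces storing and comparing the full content with storing and comparing a short string.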
Another option is to use an HTTP ETag, if the server provides one for the resource requested. You can simply store it and then set your request headers to include an If-None-Match field on subsequent requests. If the server ETag has remained the same, you'll get a 304 Not Modified status and an empty response body. Otherwise you'll get the new page (and a new ETag). See Entity Tags in RFC 2616.

Of course, the server could be lying, and sending the same ETag even though the content has changed. There's no way to know unless you look.
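The conditional request described above might look roughly like this. It is a sketch, not a definitive implementation: fetch_if_changed is a hypothetical helper, and it assumes $ua is an LWP::UserAgent (or any object with a compatible get method), whose get accepts extra header fields after the URL.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch $url, sending the stored ETag (if any)
# in an If-None-Match header. Returns a hash ref describing the result.
sub fetch_if_changed {
    my ($ua, $url, $etag) = @_;

    # Only send If-None-Match when we have an ETag from an earlier fetch.
    my @headers = defined $etag ? ('If-None-Match' => $etag) : ();
    my $res = $ua->get($url, @headers);

    # 304 Not Modified: the server says our stored copy is still current.
    # Check this before is_success, which is only true for 2xx codes.
    if ($res->code == 304) {
        return { changed => 0, etag => $etag };
    }

    die "Fetch failed: " . $res->status_line . "\n" unless $res->is_success;

    # New content; remember the new ETag (undef if the server sent none).
    return {
        changed => 1,
        etag    => scalar $res->header('ETag'),
        content => $res->content,
    };
}

# Usage (needs LWP::UserAgent and network access; the URL is a placeholder):
#   use LWP::UserAgent;
#   my $ua    = LWP::UserAgent->new;
#   my $first = fetch_if_changed($ua, 'http://example.com/');
#   my $again = fetch_if_changed($ua, 'http://example.com/', $first->{etag});
#   print "unchanged\n" unless $again->{changed};
```

If the server sends no ETag at all, the helper simply fetches the full page every time, so the digest approach remains a useful fallback.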