I know this question has been answered before in this thread, but I couldn’t seem to find the details.
In my scenario, I am building a console application that will keep an eye on a page's HTML source for any changes. If an update or change occurs, I will perform further operations. I'll also send a request every second, or as soon as the previous request finishes.
I can’t figure out whether I should use HttpWebRequest or WebClient to download the HTML source and compare it. What do you think would be the ideal solution in my case? I need both speed and reliability 🙂
I’d go with HttpWebRequest because it’s not as abstracted and lets you fiddle with HTTP parameters quite a bit. For example, it gives you the option not to download the entire page if the server reports that the file has not changed.

If you add a parameter such as IfModifiedSince to your request (it can be a HEAD or a GET request), the server may return the response code 304 – NOT MODIFIED. Refer to the description of caching in HTTP for further explanation.

The point is to make sure that you only download the full page when it has actually been modified since the last time you fetched it. Most of the time it won’t have changed (I suppose; I can’t know for sure without knowing your domain), so you only need a lightweight response from the server that simply states “nothing changed here”.
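To make the idea concrete, here is a minimal sketch of a conditional GET inside a polling loop. The class name PageWatcher, the helper FetchIfModified, and the URL are all placeholders I made up for illustration, and it assumes the server actually honors If-Modified-Since (many dynamically generated pages don't, in which case you simply get the full page every time).

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

static class PageWatcher
{
    // Hypothetical helper: returns the page source when the server sends a
    // fresh copy, or null when it answers 304 Not Modified.
    static string FetchIfModified(string url, ref DateTime? lastModified)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";

        // Only send If-Modified-Since once we have a timestamp from an earlier fetch.
        if (lastModified.HasValue)
            request.IfModifiedSince = lastModified.Value;

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            using (var reader = new StreamReader(response.GetResponseStream()))
            {
                // Remember the server's timestamp for the next request. If the
                // server sends no Last-Modified header, this is the current time.
                lastModified = response.LastModified;
                return reader.ReadToEnd();
            }
        }
        catch (WebException ex)
        {
            // HttpWebRequest reports 304 as a WebException with a protocol error.
            using (var errorResponse = ex.Response as HttpWebResponse)
            {
                if (ex.Status == WebExceptionStatus.ProtocolError &&
                    errorResponse != null &&
                    errorResponse.StatusCode == HttpStatusCode.NotModified)
                {
                    return null; // nothing changed since lastModified
                }
            }
            throw; // any other failure is a real problem, pass it on
        }
    }

    static void Main()
    {
        const string url = "http://example.com/"; // placeholder URL
        DateTime? lastModified = null;

        while (true)
        {
            string html = FetchIfModified(url, ref lastModified);
            if (html != null)
                Console.WriteLine("Page changed, {0} characters downloaded", html.Length);

            Thread.Sleep(TimeSpan.FromSeconds(1)); // wait before the next poll
        }
    }
}
```

Since the loop only sleeps after the previous request has finished, requests never overlap. If the server ignores the header, you would have to fall back to comparing the downloaded source yourself.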
Update: code sample demonstrating the use of the IfModifiedSince property:

This method should return true if the page was modified since the dateTime date and false if it wasn’t. The GetResponse method will throw a WebException if you make a HEAD request and the server returns 304 – NOT MODIFIED (which is kinda unfortunate). We have to make sure that it’s not some other web connection problem; that’s why I check the status of the web exception and the HTTP status in the response. If anything else caused the exception, we just rethrow it.

This sample code produces the output:
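For reference, a method matching the description above could look roughly like the sketch below. Only the dateTime parameter name comes from the text; the method name IsModifiedSince, the class name, and the URL are assumptions, so treat this as an illustration rather than the exact original listing.

```csharp
using System;
using System.Net;

static class ModificationCheck
{
    // Assumed name and signature; only "dateTime" is taken from the description.
    static bool IsModifiedSince(string url, DateTime dateTime)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";            // ask for headers only, no body
        request.IfModifiedSince = dateTime; // sent as the If-Modified-Since header

        try
        {
            // A normal response means the server considers the page modified.
            using (request.GetResponse())
            {
                return true;
            }
        }
        catch (WebException ex)
        {
            // Anything other than an HTTP-level error is a genuine connection problem.
            if (ex.Status != WebExceptionStatus.ProtocolError)
                throw;

            using (var response = (HttpWebResponse)ex.Response)
            {
                // Only 304 means "not modified"; other status codes are rethrown.
                if (response.StatusCode != HttpStatusCode.NotModified)
                    throw;
            }
            return false;
        }
    }

    static void Main()
    {
        const string url = "http://example.com/"; // placeholder URL
        // Has the page changed during the last 24 hours?
        Console.WriteLine(IsModifiedSince(url, DateTime.Now.AddDays(-1)));
    }
}
```

What it prints depends entirely on the server and on when the page was last changed.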
Note: make sure to read Jim Mischel’s addition to this answer, as he gives some good advice on this technique.