I have an Android application that’s a “link” to a magazine website.
The activity of the application would be the magazine website itself.
I’ve made a widget for it and I run a service that has one sole purpose: to detect when a new magazine is online. When the service detects one, it changes the widget icon.
Now my question is how I can detect a new magazine. I was thinking about downloading a file from the website every 6 hours and comparing the version of the latest magazine (I may start with 0 as a local variable in the application and compare it with the number provided by the downloaded document).
Is there a better way to do it?
It depends on what you consider to be a “change”. Assuming you want to detect any change, download the magazine homepage (or some other file) and compute an MD5 or similar hash of it. Store the hash.
Next time you do a download, you hash it again then compare hashes. If the hashes are identical, the page is unchanged. The benefit of the hash is the reduced storage requirements – you only need to save a handful of bytes, not a whole document.
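As a rough sketch of the hash comparison described above, something like the following could work. The class and method names (PageHash, md5Hex, hasChanged) are made up for illustration, and how you download the bytes and where you keep the stored hash are left out; on Android the hash could live in SharedPreferences, for instance.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: names are illustrative, not from any library.
final class PageHash {

    /** Returns the MD5 digest of the given bytes as a lowercase hex string. */
    static String md5Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to 0..255 so negative bytes print as two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm name, so this is not expected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * True when there is no stored hash yet, or when the downloaded
     * content hashes to something different from the stored hash.
     */
    static boolean hasChanged(String storedHash, byte[] downloaded) {
        return storedHash == null || !storedHash.equals(md5Hex(downloaded));
    }
}
```

After each download you would call hasChanged with the previously stored hash, and then store md5Hex of the new content for the next check.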
Be aware, however, that most pages are NOT static – imagine a page with a clock in the corner or any dynamic content – in this scenario, your page will always appear to be different.
For some well-run sites and servers, you may be able to look at the HTTP headers to get information about when the page was created/modified/is set to expire. This won’t be provided by everyone and can sometimes just be plain wrong.
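A minimal sketch of the header check, assuming the server sends an ETag and/or Last-Modified header. The class name HeaderCheck and its methods are hypothetical. Because the headers may be missing or wrong, the sketch returns UNKNOWN when it has nothing to compare, so you can fall back to another method.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: names are illustrative, not from any library.
final class HeaderCheck {

    enum Result { CHANGED, UNCHANGED, UNKNOWN }

    /**
     * Compares the values remembered from the last check with the values
     * just received. A null ETag or a timestamp of 0 means "not provided".
     */
    static Result compare(String oldEtag, long oldLastModified,
                          String newEtag, long newLastModified) {
        if (oldEtag != null && newEtag != null) {
            return oldEtag.equals(newEtag) ? Result.UNCHANGED : Result.CHANGED;
        }
        if (oldLastModified > 0 && newLastModified > 0) {
            return newLastModified > oldLastModified ? Result.CHANGED : Result.UNCHANGED;
        }
        return Result.UNKNOWN; // Server gave us nothing usable.
    }

    /** Sends a HEAD request (headers only, no body) and compares the headers. */
    static Result check(String pageUrl, String oldEtag, long oldLastModified)
            throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(pageUrl).toURL().openConnection();
        conn.setRequestMethod("HEAD");
        try {
            String etag = conn.getHeaderField("ETag");   // null if absent
            long lastModified = conn.getLastModified();  // 0 if absent
            return compare(oldEtag, oldLastModified, etag, lastModified);
        } finally {
            conn.disconnect();
        }
    }
}
```

A HEAD request keeps the traffic small, which matters for a service polling from a phone, but not every server handles HEAD correctly, so treat this as an optimisation rather than the only check.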
The ideal solution is to find one particular page (or part of a page) which will only change once with every new issue; then you can just keep checking that one thing. An example of this might be a link that always points at the latest issue, or the URL of the main image, which changes with each issue.
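To illustrate checking just one part of a page, here is a sketch that pulls out the target of a "latest issue" link and compares it with the last value seen. The markup it looks for (an anchor with id="latest-issue", with the id attribute before href) is purely an assumption; you would adapt the pattern to whatever the real page contains. A regular expression is enough for a single known element, though a proper HTML parser would be more robust.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the id "latest-issue" is an assumed marker in the page.
final class LatestIssueLink {

    // Matches <a ... id="latest-issue" ... href="..."> and captures the href.
    private static final Pattern LINK = Pattern.compile(
            "<a[^>]*\\bid=\"latest-issue\"[^>]*\\bhref=\"([^\"]*)\"");

    /** Returns the href of the latest-issue link, or null if it is not found. */
    static String extract(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    /** True when a link was found and it differs from the one seen last time. */
    static boolean isNewIssue(String html, String lastSeenHref) {
        String current = extract(html);
        return current != null && !current.equals(lastSeenHref);
    }
}
```

Unlike hashing the whole page, this ignores clocks, adverts and other dynamic content, since only the captured link is compared.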
Of course, if the magazines are willing to help, they could expose the information to you in a number of ways from a simple file with just an issue number inside to a full-on webservice.
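For the simplest case, a file containing just an issue number, the comparison could look like the sketch below. It assumes the file body is a single integer, possibly with surrounding whitespace; the class name IssueCheck is made up for illustration. This is essentially the approach from the question, starting with 0 as the last seen issue.

```java
// Hypothetical helper: assumes the downloaded file holds only an issue number.
final class IssueCheck {

    /**
     * True when the downloaded file contains a number greater than the
     * last issue the app has seen. Unparsable content counts as "no new issue"
     * so a broken download never triggers a false notification.
     */
    static boolean isNewer(String fileBody, int lastSeenIssue) {
        try {
            int latest = Integer.parseInt(fileBody.trim());
            return latest > lastSeenIssue;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```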
Edit: Assuming multiple magazines under your control, I’d suggest you have a single page that returns a list of the latest issues for each magazine in a readily parsable format (JSON, XML). This list could be a static file, edited by hand, if issues are infrequent or publishing is a very manual process. Even better would be a simple database table which is read to generate the list. This way you can have a nice UI to update it and allow someone else to maintain it without giving them access to the server file system.
I’d also suggest that you assign a truly unique id/key to each magazine and to each issue of that magazine – so that in future, you can add other functionality like downloading locally for offline reading / syncing back issues.