I’m looking to regularly download XML files from remote locations, analyse them, and store the results in a database. This will run on my dedicated Linux server, but I’m not sure which is the most efficient way of doing it, as I don’t know the overheads of the different languages.
I have looked at some options: I could either download and analyse the files entirely in PHP, Perl, Python or C, or use a combination (one to download with little overhead, one to analyse, one to store in the database). What would be the best option / combination?
Cheers for any help in advance.
As a (very) general rule of thumb, C will have the least overhead and will be fastest, because it’s compiled to native code rather than interpreted.
That being said, that difference generally isn’t noticeable. Unless you’re dealing with seriously massive XML documents, you’re talking milliseconds. The design of the XML library you choose to use, not the language, will have a far larger impact.
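To illustrate how much the library's design matters, here is a rough Python sketch comparing two parsing styles from the standard library's xml.etree.ElementTree: streaming with iterparse versus building the whole tree with parse. The element name "item" and the sample document are made up for the example; your files will look different.

```python
import io
import xml.etree.ElementTree as ET


def count_items_streaming(source, tag="item"):
    """Count <tag> elements while reading the document incrementally."""
    count = 0
    # iterparse yields each element once its closing tag has been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            count += 1
            elem.clear()  # drop the element's children and text once handled
    return count


def count_items_full(source, tag="item"):
    """Same count, but the complete tree is built in memory first."""
    root = ET.parse(source).getroot()
    return sum(1 for _ in root.iter(tag))


# Hypothetical sample document, only for demonstration.
sample = b"<feed><item>a</item><item>b</item><item>c</item></feed>"
print(count_items_streaming(io.BytesIO(sample)))
print(count_items_full(io.BytesIO(sample)))
```

Both functions give the same answer; the difference is that the streaming version can discard each element as soon as it has been handled, which tends to matter only once documents get large.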
I think this is a case of premature optimization. Do you know in advance that your XML files are huge? Pick the language you like. If you do run into trouble, you can then move the bottleneck into another language.
My guess is the bottleneck will be your network connection, not parsing/analyzing/storing.
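If you would rather measure than guess, something like the following Python sketch times each stage separately. Everything specific in it is an assumption for illustration: the "item" and "title" element names, the "items" table, the use of SQLite as the database, and the example URL in the comment at the bottom.

```python
import sqlite3
import time
import urllib.request
import xml.etree.ElementTree as ET


def timed(fn, *args):
    """Call fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def download(url):
    """Fetch the raw bytes of the document at url."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def analyse(xml_bytes):
    """Extract one row per <item>; the element names are hypothetical."""
    root = ET.fromstring(xml_bytes)
    return [(item.findtext("title", default=""),) for item in root.iter("item")]


def store(conn, rows):
    """Insert the rows into a hypothetical 'items' table; return the row count."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (title TEXT)")
    conn.executemany("INSERT INTO items (title) VALUES (?)", rows)
    conn.commit()
    return len(rows)


def run(url, conn):
    """Run all three stages and report how long each one took."""
    data, t_download = timed(download, url)
    rows, t_analyse = timed(analyse, data)
    _, t_store = timed(store, conn, rows)
    return {"download": t_download, "analyse": t_analyse, "store": t_store}


# Example call (placeholder URL and database file):
# print(run("https://example.com/feed.xml", sqlite3.connect("results.db")))
```

If the download figure dwarfs the other two, rewriting the parsing in C will not buy you anything.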