One of my clients needs to pull customer data from a web service periodically. The data itself is provided as CSV files via javascript postbacks, as is most of the navigation within the service’s website.
Right now, the worst bottleneck in the entire system is the need for a human to log into the web page, navigate to the download page, and manually add the downloaded file to the rest of the system.
Can the process of downloading files via postback be automated (say, via a shell script run from cron)?
If so, what would you recommend as the most appropriate tool for doing so?
In case anyone comes across this question again, I’ve found a solution:
The trick is to use Mechanize, and a series of calls to `Browser.submit()` on the appropriate pages.

The one hangup that others might also encounter is that ASP.NET pages (the biggest source of postback-based navigation, in my experience) also need a hidden parameter called `__EVENTTARGET` in the form, which won't exist when you use mechanize.

The `__doPostBack('foo')` function on these pages gives the relevant value to `__EVENTTARGET` via a javascript onclick event on each of the links, but since mechanize doesn't use javascript you'll need to set these values yourself.

I made up a quick little utility function to use within my scripts that does this:
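The helper itself isn't shown here, so what follows is only a rough sketch of what such a function might look like with Python's mechanize, not the original code. The names `postback_args` and `do_postback` are made up for illustration, and filling in `__EVENTARGUMENT` alongside `__EVENTTARGET` is an assumption about what a typical ASP.NET page expects.

```python
import re

# Matches __doPostBack('target') or __doPostBack('target','argument').
_POSTBACK_RE = re.compile(
    r"__doPostBack\(\s*'([^']*)'\s*(?:,\s*'([^']*)'\s*)?\)"
)


def postback_args(href):
    """Return (target, argument) parsed from a __doPostBack(...) link, or None."""
    match = _POSTBACK_RE.search(href)
    if match is None:
        return None
    return match.group(1), match.group(2) or ""


def _set_hidden(form, name, value):
    """Set a hidden field on the form, creating it if the page did not render it."""
    if any(control.name == name for control in form.controls):
        form[name] = value
    else:
        form.new_control("hidden", name, {"value": value})


def do_postback(browser, target, argument="", form_index=0):
    """Do by hand what __doPostBack would do in a real browser, then submit.

    `browser` is expected to be a mechanize.Browser that has already opened
    the page; `form_index` picks which form on the page to use (assumed 0).
    """
    browser.select_form(nr=form_index)
    form = browser.form
    form.set_all_readonly(False)  # hidden controls are read-only by default
    _set_hidden(form, "__EVENTTARGET", target)
    _set_hidden(form, "__EVENTARGUMENT", argument)
    form.fixup()  # let the form pick up any newly added controls
    return browser.submit()


# Hypothetical usage (the URL and control name are placeholders):
#   import mechanize
#   br = mechanize.Browser()
#   br.open("https://example.com/reports.aspx")
#   response = do_postback(br, "ctl00$Main$lnkExport")
#   with open("export.csv", "wb") as out:
#       out.write(response.read())
```

The value to pass as `target` is whatever appears inside the `__doPostBack(...)` call on the link you would normally click. `postback_args` can pull it out of the link's `href` or `onclick` text if you don't want to hard-code it.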
I hope this helps anyone who comes across this question later.