I’m using PHP to scrape data from another website. However, on certain occasions I need to confirm a value (because there are two very similar possibilities).
The button I’m supposed to click to confirm my variable is:
<input type="submit" class="buttonEmphasized confirm_nl" name="start" value="Bevestig" accesskey="s" />
However, adding &start=Bevestig to the URL doesn’t seem to solve the problem; I just receive the same page again. What’s more, the website uses sessions, and every http_post_data call seems to start a new session.
Is there a way to let PHP “click” a button if a certain output is missing?
This is a scraper for train timetable data (the site uses the HAFAS system).
Cheers
There is no general solution for this problem; every site is different in some way. Your best bet is to analyze the HTTP request the original page sends when you click the button. You can do this with Firefox + Firebug + Live HTTP Headers, for example. That way you will see all the parameters (required or not), and you can then replicate the request with your script.
It might (most likely will) require faking session/cookie data, because the server has to see your confirmation as part of the same session as the first request. You might need to use cURL for that.
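To make that concrete, here is a rough sketch of how it could look with cURL. It is not tested against the real site, and several things are assumptions: the helper names (openSession, fetch, buildConfirmPost, scrape) are made up for illustration, the form is assumed to POST back to the same URL (check its action attribute), and the page is assumed to carry its state in plain hidden/text inputs. Only the start=Bevestig button comes from your question. The idea is to reuse one cURL handle with the cookie engine enabled so the session survives, and to send the button's name/value together with the other form fields only when the button is present.

```php
<?php
// Create one cURL handle and reuse it, so cookies (the session) persist.
function openSession()
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEFILE     => '', // empty string: in-memory cookie engine
    ]);
    return $ch;
}

// GET when $post is null, otherwise a form-encoded POST.
function fetch($ch, string $url, ?array $post = null): string
{
    curl_setopt($ch, CURLOPT_URL, $url);
    if ($post === null) {
        curl_setopt($ch, CURLOPT_HTTPGET, true);
    } else {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($post));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException(curl_error($ch));
    }
    return $body;
}

// Returns the fields to POST if the page contains a submit button named
// $button, or null if no confirmation is needed.
// Simplification: looks at every <input> on the page, not a specific form,
// and ignores checkboxes, radios and selects.
function buildConfirmPost(string $html, string $button = 'start'): ?array
{
    if ($html === '') {
        return null;
    }
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // real-world HTML is rarely valid; silence warnings
    $fields = [];
    $clicked = false;
    foreach ($doc->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        $type = strtolower($input->getAttribute('type'));
        if ($name === '') {
            continue;
        }
        if ($type === 'submit') {
            // A browser only sends the submit button that was clicked.
            if ($name === $button) {
                $fields[$name] = $input->getAttribute('value');
                $clicked = true;
            }
            continue;
        }
        if ($type === 'hidden' || $type === 'text' || $type === '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $clicked ? $fields : null;
}

// Fetch the page; if the confirm button shows up, "click" it once.
function scrape(string $url): string
{
    $ch = openSession();
    $html = fetch($ch, $url);
    $post = buildConfirmPost($html);
    if ($post !== null) {
        // Assumption: the form posts back to the same URL.
        $html = fetch($ch, $url, $post);
    }
    return $html;
}
```

You would call scrape() with the URL of your timetable query. If the site still returns the same page, compare the request your script sends with the one captured in the browser; a missing hidden field, a different action URL, or a required header is the usual culprit.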