I am looking to build a web app to improve the user experience of booking railway tickets in India. The official API is effectively out of reach because of the hefty charge to procure it, yet I have seen many apps that provide train details and similar information.
My question is: how are they scraping data from the website? More generally, how can I legally get the data that any website shows to its users (I don't need payment and other features that are impossible without the API)? How do people scrape such data? Are there any tools or methods?
Bear with me if the question is naive. I'm pretty new to this stuff.
They can get the train schedule information using any of several programming languages, though it is most likely done with ordinary PHP on any good web host. For example, all Indian train schedules can be found on the indianrail.gov website.
Sending a specially built URL to ..
using the POST method of sending form data should give you all the details for train number 1123. After that, it is just a simple task of tidying up the results for storage in a database.
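To make the idea concrete, here is a minimal Python sketch of the two steps: building a form POST, then tidying an HTML table into database rows. Everything specific in it is an assumption for illustration only: the endpoint `ENQUIRY_URL`, the form field `train_number`, and the three-column station/arrival/departure layout are hypothetical, not the real site's names or markup. It uses only the standard library.

```python
import sqlite3
import urllib.parse
import urllib.request
from html.parser import HTMLParser

# Hypothetical endpoint and form field; the real ones are not given in this thread.
ENQUIRY_URL = "http://example.invalid/train-schedule"


def build_schedule_request(train_number):
    """Build (but do not send) a POST request carrying the form data."""
    form = urllib.parse.urlencode({"train_number": train_number}).encode("ascii")
    return urllib.request.Request(ENQUIRY_URL, data=form, method="POST")


class TableRows(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            if self._row:  # header rows made of <th> cells stay empty and are skipped
                self.rows.append(tuple(self._row))
            self._row = None
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell and self._row:
            self._row[-1] += data.strip()


def store_stops(conn, train_number, html):
    """Parse the HTML and insert one row per stop; returns the number stored."""
    parser = TableRows()
    parser.feed(html)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stops "
        "(train TEXT, station TEXT, arrival TEXT, departure TEXT)"
    )
    # Assumed layout: exactly three cells per stop (station, arrival, departure).
    rows = [(train_number, *r) for r in parser.rows if len(r) == 3]
    conn.executemany("INSERT INTO stops VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

To actually fetch a page you would pass the request to `urllib.request.urlopen`, decode the response body, and hand the text to `store_stops`. The real page will almost certainly need its own parsing rules.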
Update: the site is well armoured. It checks both the User-Agent and the Referer headers of inbound requests.
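For reference, these are ordinary HTTP request headers that any client can set. A minimal Python sketch follows; the header values and the helper name `with_browser_headers` are hypothetical, and whether the site accepts such a request is something I have not verified.

```python
import urllib.request

# Hypothetical values: a User-Agent naming your app and the page the form lives on.
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (compatible; my-train-app/0.1)",
    "Referer": "http://example.invalid/enquiry-form",
}


def with_browser_headers(url, data=None):
    """Build a request that carries explicit User-Agent and Referer headers."""
    return urllib.request.Request(url, data=data, headers=BROWSER_HEADERS)
```

The returned object can be passed to `urllib.request.urlopen` like any other request. If `data` is given as bytes, it is sent as a POST.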
Addendum: the indianrail.gov site is moving to http://www.trainenquiry.com/, so I will have to take another look.