I am currently trying to write a Python script that sends a GET request to a webpage (using the requests module) and then parses the response with Beautiful Soup.
The problem I am running into is that the table I want is created by JavaScript after the initial DOM loads, so the response to my GET request does not contain it.
There are two things you could do, depending on your problem.
1. Get the table directly
If all you actually want is the table, check which request the page issues to load it. For example, you can use Firebug or Chrome Developer Tools to find the URL and parameters of that request.
2. The Javascript is important
If it is more important to support many websites, or the page relies on the JavaScript doing some magic, you could use something like Selenium to drive a browser that executes the JavaScript, and then get the page source after the table has loaded.
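A rough sketch of the browser-driven option, in case it helps. The function name `rendered_html`, the default `"table"` selector and the `https://example.com/` URL are placeholders of mine, not anything from your page. To keep the sketch small it polls by hand instead of using Selenium's own `WebDriverWait` helper, which is what you would more commonly reach for.

```python
import time


def rendered_html(driver, url, selector="table", timeout=15.0, poll=0.5):
    """Load url in a Selenium-style driver and return the page source
    once at least one element matches the CSS selector."""
    driver.get(url)
    deadline = time.monotonic() + timeout
    while True:
        # "css selector" is the locator strategy string behind By.CSS_SELECTOR.
        if driver.find_elements("css selector", selector):
            return driver.page_source
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no element matching {selector!r} after {timeout}s")
        time.sleep(poll)


def demo():
    """Not called automatically: needs the third-party selenium package
    and a local Chrome installation."""
    from selenium import webdriver

    driver = webdriver.Chrome()
    try:
        # Placeholder URL; replace it with the page that builds the table.
        return rendered_html(driver, "https://example.com/")
    finally:
        driver.quit()
```

The string that comes back is ordinary HTML, so it can be handed to Beautiful Soup exactly like the body of a plain GET response.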
Update – based on your URL in your comment
You can see in the Network tab of the Chrome Developer Tools that this request takes a very long time to load:
http://www.ticketmaster.com/json/browse/music?select=n93
So we can assume that this request loads your data. Open the URL in your browser and you will see that the data of your table is there, in JSON format.
If you only want to get and parse this one table, and don’t need something generic for a lot of pages, I would just GET the data with this approach.
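Something along these lines should be enough to pull the data, assuming the endpoint still answers the way it did in the browser. The helper name `fetch_json` and the header values are my own choices, and I have used only the standard library here so it runs without extra installs; with the requests module, `requests.get(URL).json()` does the same job.

```python
import json
from urllib.request import Request, urlopen

# The JSON endpoint spotted in the Network tab.
URL = "http://www.ticketmaster.com/json/browse/music?select=n93"


def fetch_json(url, opener=urlopen, timeout=30):
    """GET url and decode the response body as JSON."""
    # Assumption: a browser-like User-Agent may be needed; the value is arbitrary.
    request = Request(
        url,
        headers={"User-Agent": "Mozilla/5.0", "Accept": "application/json"},
    )
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(URL)` returns ordinary Python dicts and lists, so there is no HTML to parse at all. The `opener` parameter is only there so the function can be tried without touching the network.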
Update
Try changing the table with the filter or the day range. That way you can examine how the API works and issue the request in the way you want.
Filter for Dance/Electronic in the next 7 days:
/json/browse/music?g=Dance%2FElectronic&select=n7
There is also another API call:
http://www.ticketmaster.com/json/browse/music/histogram?select=n7
I can’t tell you what that one is for, but I think you now have a good direction, and more time than I do to understand how it works 😉
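To issue such filtered requests from a script, you can let Python build the query string instead of escaping it by hand. This is only a sketch: `browse_url` is a name I made up, and the parameters `g` and `select` are simply the ones seen in the requests above, so treat their meaning as observed rather than documented.

```python
from urllib.parse import urlencode, urljoin

BASE = "http://www.ticketmaster.com"


def browse_url(path="/json/browse/music", **params):
    """Build a browse URL; urlencode takes care of escaping (e.g. '/' -> %2F)."""
    query = urlencode(params)
    return urljoin(BASE, path) + ("?" + query if query else "")


# Dance/Electronic in the next 7 days, matching the request shown above.
print(browse_url(g="Dance/Electronic", select="n7"))
# prints http://www.ticketmaster.com/json/browse/music?g=Dance%2FElectronic&select=n7
```

The same helper covers the second call: `browse_url("/json/browse/music/histogram", select="n7")`.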
Tools
The tool I’ve used to find the URLs is the built-in Chrome Developer Tools, with the Network tab. Activate the tool, refresh the page and tinker with the requests to understand what’s happening.
It’s also very easy to parse JSON with Python: http://docs.python.org/library/json.html
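Since I don't know the exact layout of the response, a small helper like this can help you get your bearings once the JSON is decoded. The `outline` function and the sample payload are invented for illustration; the real response will have different keys.

```python
import json


def outline(value, path="$", depth=2):
    """Yield (path, description) pairs for the top levels of decoded JSON."""
    if isinstance(value, dict):
        yield path, f"object with {len(value)} keys"
        if depth > 0:
            for key, child in value.items():
                yield from outline(child, f"{path}.{key}", depth - 1)
    elif isinstance(value, list):
        yield path, f"array of {len(value)} items"
        if depth > 0 and value:
            # Only the first item is inspected; rows usually share one shape.
            yield from outline(value[0], f"{path}[0]", depth - 1)
    else:
        yield path, type(value).__name__


# Made-up payload, NOT the real response format.
sample = json.loads(
    '{"response": {"docs": [{"name": "Some Event", "venue": "Some Hall"}], "total": 1}}'
)
for path, description in outline(sample, depth=3):
    print(path, "->", description)
```

Once you have found the array that holds the rows, each item is a plain dict and can be read with normal key lookups.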