I am trying to get my script to download subtitles from http://www.subscene.com. The problem is that the download button on webpage is made in java, and for some reason i cannot download subtitles even if i extract the URL.
I think this is the code for the download button:
<a id="s_lc_bcr_downloadLink" class="downloadLink rating0" href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))">Download English Subtitle</a><a id="s_lc_bcr_previewLink" href="javascript:togglePreview(482407, 'zip');">(See preview)</a>
so i extract the url and tell my script to download it:
urllib.urlretrieve('http://subscene.com/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx','c:\\sub.zip')
(Added ‘http://subscene.com’)
But for some reason it doesnt download the right file. What am i supposed to do?
EDIT:
Thanks a lot! unfortunately i cant get it to work 🙁 it says the following
from selenium import webdriver
browser = webdriver.Firefox()
browser.execute_script('WebForm_DoPostBackWithOptions(newWebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))')
Traceback (most recent call last):
File "<pyshell#2>", line 1, in <module>
browser.execute_script('WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions("s$lc$bcr$downloadLink", "", true, "", "/english/How-I-Met-Your-Mother-Seventh-Season/subtitle-482407-dlpath-90698/zip.zipx", false, true))')
File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\webdriver.py", line 385, in execute_script{'script': script, 'args':converted_args})['value']
File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\webdriver.py", line 153, in execute
self.error_handler.check_response(response)
File "C:\Users\User\AppData\Roaming\Python\Python27\site-packages\selenium\webdriver\remote\errorhandler.py", line 126, in check_response
raise exception_class(message, screen, stacktrace)
WebDriverException: Message: ''
As John said this is not the file but javascript code. So instead of getting that file using urllib.urlretrieve, you can execute the javascript which downloads the files in turn. This can be done using selenium module –
I got this javascript snippet using Firebug.