System Info:
Windows 7,
GNU Wget 1.11.4,
Python 2.6
The problem:
I'm running a Python script that fires a wget shortcut. The problem is that wget (even when run directly from the command line via the exe) cuts the URL off at each '&'. For example, when I run the command below, this is what I get:
C:\Program Files\GnuWin32\bin>wget.exe http://www.imdb.com/search/title?genres=action&sort=alpha,asc&start=51&title_type=feature
SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files\GnuWin32/etc/wgetrc
--2013-01-18 12:48:43-- http://www.imdb.com/search/title?genres=action
Resolving http://www.imdb.com... 72.21.215.52
Connecting to http://www.imdb.com|72.21.215.52|:80... failed: Connection refused.
=alpha,asc
The system cannot find the file specified.
The system cannot find the file 51.
'title_type' is not recognized as an internal or external command, operable program or batch file.
As you can see, wget treats all the text before the first '&' as the URL, and Windows takes the rest as one or more new commands.
There has got to be some way of allowing wget to capture that whole string as the URL.
Thanks in advance.
EDIT:
When I call the command from the command line with quotes around the URL, it works great. However, when I run it through the Python script:
subprocess.Popen(['start /B wget.lnk --directory-prefix=' + output_folder + ' --output-document=' + output_folder + 'this.html "http://www.imdb.com/search/title?genres=action&sort=alpha,asc&start=51&title_type=feature"'], shell=True)
I get the following error:
SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
"http://www.imdb.com/search/title?genres=action&sort=alpha,asc&start=51&title_ty
pe=feature": Unsupported scheme.
It's not Wget that cuts off the URL, but the command interpreter, which uses & to separate two commands, akin to ; in Unix shells. This is indicated by the "=alpha,asc" and "The system cannot find the file specified." output right after the Wget messages. To prevent this from happening, quote the entire URL.
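From Python, one way to sidestep the quoting question is to avoid the shell altogether and pass the arguments as a list, so that cmd.exe never gets a chance to interpret the & characters. The sketch below is only an illustration under a few assumptions: it calls wget.exe directly (a .lnk shortcut generally cannot be launched without the shell), it assumes wget.exe is on the PATH or that a full path is passed in, and the helper names build_wget_args and fetch are made up for this example. The "Unsupported scheme" message in the EDIT suggests that the literal quote characters reached wget as part of the URL, which would not happen here because no quotes are added at all.

```python
import os
import subprocess

URL = ("http://www.imdb.com/search/title"
       "?genres=action&sort=alpha,asc&start=51&title_type=feature")


def build_wget_args(url, output_folder, wget="wget.exe"):
    """Build the wget argument list (hypothetical helper).

    Each list item is handed to wget as exactly one argument, so the
    URL needs no extra quoting even though it contains '&'.
    """
    return [
        wget,  # assumption: wget.exe is on PATH, or pass a full path
        "--output-document=" + os.path.join(output_folder, "this.html"),
        url,
    ]


def fetch(url, output_folder):
    """Run wget without shell=True and return its exit code."""
    # Without a shell, '&' is an ordinary character in the argument.
    return subprocess.call(build_wget_args(url, output_folder))
```

Calling fetch(URL, output_folder) would then run the download and wait for it to finish; subprocess.Popen with the same argument list can be used instead if the script should not block.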