A web server responds to a POST request with a file to download (the response has a Content-Disposition header). When using a urllib or mechanize opener, at what point is the response body downloaded?
opener = mechanize.build_opener(HTTPRefererProcessor, HTTPEquivProcessor, HTTPRefreshProcessor)
r = make_post_request() # makes Request object to send
res = opener.open(r)
info = res.info()  # headers of the response returned by opener.open()
content_disp = info.getheader('content-disposition')
filename = content_disp.split('=')[1]
content = res.read() # or skip based on filename
I was under the impression that the body would not be downloaded until read() is called, which would be useful for skipping certain downloads (such as files already downloaded), but I am not seeing a great deal of performance improvement.
Well, when you just want headers, you should be using HTTP HEAD. POST and GET will by definition return content.
In terms of stopping the download, the web server won’t wait to start sending you data, and everything from Python to your network card will start receiving and buffering the data immediately.
So your best bet is to find a better way of doing this, HTTP HEAD for example. If that’s not an option, call close() on the response object immediately after getting whatever headers you need and hope you didn’t waste too much bandwidth.
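The close-early fallback could look roughly like the sketch below. It assumes Python 3's urllib.request rather than the mechanize opener from the question, and the names filename_from_disposition, post_and_maybe_download and already_have are made up for illustration. Whatever the server has already pushed into the socket buffers is still wasted; closing only stops you from pulling the rest.

```python
import urllib.request
from email.message import Message


def filename_from_disposition(value):
    """Return the filename parameter of a Content-Disposition value, or None."""
    if not value:
        return None
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_filename()


def post_and_maybe_download(url, data, already_have):
    """POST to url and read the body only if the file is not in already_have.

    Hypothetical helper: returns (filename, body), with body set to None
    when the download was skipped.
    """
    req = urllib.request.Request(url, data=data, method="POST")
    res = urllib.request.urlopen(req)  # headers are available once this returns
    try:
        filename = filename_from_disposition(res.headers.get("Content-Disposition"))
        if filename in already_have:
            return filename, None  # skip: close without ever calling read()
        return filename, res.read()
    finally:
        res.close()
```

Parsing the header with email.message.Message instead of split('=') also copes with quoted filenames and extra parameters.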
(And for an example on using HTTP HEAD in Python, see this answer from a while ago.)
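For reference, a minimal HEAD sketch, again assuming Python 3's urllib.request (head_headers is a hypothetical name). It only helps if the server answers HEAD for that URL with the same headers it would send for the real request, which an endpoint that expects a POST may well not do.

```python
import urllib.request


def head_headers(url):
    """Send a HEAD request and return the response headers (no body is sent)."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req) as res:
        # The headers object stays usable after the response is closed,
        # and its get() is case-insensitive.
        return res.headers
```

You would then check head_headers(url).get("Content-Disposition") and only issue the real request when the file is one you still need.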