Does urllib2 fetch the whole page when a urlopen call is made?
I’d like to just read the HTTP response header without getting the page. It looks like urllib2 opens the HTTP connection and then subsequently gets the actual HTML page… or does it just start buffering the page with the urlopen call?
import urllib2
myurl = 'http://www.kidsidebyside.org/2009/05/come-and-draw-the-circle-of-unity-with-us/'
page = urllib2.urlopen(myurl)  # open connection, get headers
html = page.readlines()  # stream page
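Use the response.info() method to get the headers. From the urllib2 docs: urlopen() returns a file-like object with two additional methods, geturl() and info(), and info() returns the meta-information of the page, such as headers, as an httplib.HTTPMessage instance.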
So, for your example, try stepping through the result of response.info().headers (page.info().headers with your variable name) for what you’re looking for.
Note the major caveat to using httplib.HTTPMessage is documented in Python issue 4773.
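As a rough sketch of the info() approach, something like the following should work. The helper name fetch_headers is made up for illustration, and the import fallback to urllib.request is an assumption added so the same code also runs on Python 3, where urllib2 no longer exists. It uses info().items() rather than .headers, since items() is available on the message object in both versions. Note that this only avoids calling read(); it does not by itself prove how much of the body the library has already pulled off the socket.

```python
try:
    from urllib2 import urlopen          # Python 2, as in the question
except ImportError:
    from urllib.request import urlopen   # Python 3 equivalent (assumption)


def fetch_headers(url):
    """Open url and return its response headers as a dict.

    Hypothetical helper: header names are lower-cased so lookups do not
    depend on how the server capitalised them. read() is never called.
    """
    response = urlopen(url)
    try:
        info = response.info()  # message object holding the headers
        return dict((name.lower(), value) for name, value in info.items())
    finally:
        response.close()        # release the connection without reading the body
```

With the URL from the question you would call fetch_headers(myurl) and then look up keys such as 'content-type' in the returned dict.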