I’m trying to go through a series of numbered data pages using urllib2.

Question

Asked: May 27, 2026 (2026-05-27T03:04:31+00:00)

I’m trying to go through a series of numbered data pages using urllib2. What I want to do is use a try statement, but I have little knowledge of it. From what I’ve read, it seems to be based on specific ‘names’ that are exceptions, e.g. IOError. I don’t know which exception I should be catching, which is part of the problem.

I’ve written / pasted from ‘urllib2 the missing manual’ my urllib2 page fetching routine thus:

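import os
import sys
import cookielib
import urllib2

COOKIEFILE = 'cookies.lwp'  # assumed filename; not shown in the original post

def fetch_page(url, useragent):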
    urlopen = urllib2.urlopen
    Request = urllib2.Request
    cj = cookielib.LWPCookieJar()

    txheaders =  {'User-agent' : useragent}

    if os.path.isfile(COOKIEFILE):
        cj.load(COOKIEFILE)
        print "previous cookie loaded..."
    else:
        print "no ospath to cookfile"

    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    urllib2.install_opener(opener)
    try:
        req = urllib2.Request(url, headers=txheaders)
        # create a request object

        handle = urlopen(req)
        # and open it to return a handle on the url

    except IOError, e:
        # urllib2.URLError and urllib2.HTTPError are both subclasses of IOError
        print 'Failed to open "%s".' % url
        if hasattr(e, 'code'):
            print 'We failed with error code - %s.' % e.code
        elif hasattr(e, 'reason'):
            print "The error object has the following 'reason' attribute :"
            print e.reason
            print "This usually means the server doesn't exist,",
            print "is down, or we don't have an internet connection."
        return False   # returned for every failure, HTTP errors such as 404 included

    else:
        print
        if cj is None:
            print "We don't have a cookie library available - sorry."
            print "I can't show you any cookies."
        else:
            print 'These are the cookies we have received so far :'
            for index, cookie in enumerate(cj):
                print index, '  :  ', cookie
            cj.save(COOKIEFILE)           # save the cookies once, after listing them

        page = handle.read()
        return (page)

def fetch_series():

  useragent="Firefox...etc."
  url="http://www.example.com/01.html"
  try:
    fetch_page(url,useragent)
  except [something]:
    print "failed to get page"
    sys.exit()

The bottom function is just an example to show what I mean. Can anyone tell me what I should be putting there? I made the page-fetching function return False if it gets a 404; is this correct? And if so, why doesn’t except False: work? Thanks for any help you can give.

OK, as per the advice here, I’ve tried:

except urlib2.URLError, e:

except URLError, e:

except URLError:

except urllib2.IOError, e:

except IOError, e:

except IOError:

except urllib2.HTTPError, e:

except urllib2.HTTPError:

except HTTPError:

None of them work.

1 Answer

Editorial Team · Answer 1 · 2026-05-27T03:04:32+00:00

I recommend you check out the wonderful requests module.

With it you could achieve the functionality you are asking about like so:

import requests
from requests.exceptions import HTTPError

try:
    r = requests.get('http://httpbin.org/status/200')
    r.raise_for_status()
except HTTPError:
    print 'Could not download page'
else:
    print r.url, 'downloaded successfully'

try:
    r = requests.get('http://httpbin.org/status/404')
    r.raise_for_status()
except HTTPError:
    print 'Could not download', r.url
else:
    print r.url, 'downloaded successfully'
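As for why except False: never fires: except only matches exceptions that are actually raised, and the fetch_page in the question already catches the IOError itself and hands back a return value instead. Nothing propagates to the caller, so none of the except clauses tried there can trigger. Below is a minimal sketch of testing the return value in the caller. It assumes fetch_page returns None on failure; opener and fake_opener are hypothetical stand-ins for urllib2.urlopen so the example runs without a network connection, and the import fallback is only there so the same code also runs on Python 3.

```python
try:                      # Python 2, as used in the question
    from urllib2 import URLError
except ImportError:       # Python 3 moved it to urllib.error
    from urllib.error import URLError


def fetch_page(url, opener):
    """Return the page body, or None if the request failed.

    `opener` is a hypothetical stand-in for urllib2.urlopen.
    """
    try:
        return opener(url)
    except URLError:
        # HTTPError is a subclass of URLError, so a 404 lands here too.
        # The exception is handled here and never reaches the caller.
        return None


def fetch_series(urls, opener):
    """Fetch pages in order, stopping at the first one that fails."""
    pages = []
    for url in urls:
        page = fetch_page(url, opener)
        if page is None:   # test the return value; there is nothing to catch
            break
        pages.append(page)
    return pages


def fake_opener(url):
    """Hypothetical opener: only 01.html and 02.html 'exist'."""
    if url.endswith(("01.html", "02.html")):
        return "contents of " + url
    raise URLError("not found")
```

The alternative design is to drop the try/except inside fetch_page and let the exception propagate; then except urllib2.URLError: in the caller would work as expected.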
