I have problems with my code. #!/usr/bin/env python3.1 import urllib.request; # Disguise as a

Question

0

Asked: June 1, 20262026-06-01T02:41:58+00:00 2026-06-01T02:41:58+00:00

I have problems with my code. #!/usr/bin/env python3.1 import urllib.request; # Disguise as a

0

I have problems with my code.

#!/usr/bin/env python3.1

import urllib.request;

# Disguise as a Mozila browser on a Windows OS
userAgent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)';

URL = "www.example.com/img";
req = urllib.request.Request(URL, headers={'User-Agent' : userAgent});

# Counter for the filename.
i = 0;

while True:
    fname =  str(i).zfill(3) + '.png';
    req.full_url = URL + fname;

    f = open(fname, 'wb');

    try:
        response = urllib.request.urlopen(req);
    except:
        break;
    else:
        f.write(response.read());
        i+=1;
        response.close();
    finally:
        f.close();

The problem seems to come when I create the urllib.request.Request object (called req). I create it with a non-existing url but later I change the url to what it should be. I’m doing this so that I can use the same urllib.request.Request object and not have to create new ones on each iteration. There is probably a mechanism for doing exactly that in python but I’m not sure what it is.

EDIT
Error message is:

>>> response = urllib.request.urlopen(req);
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.1/urllib/request.py", line 121, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python3.1/urllib/request.py", line 356, in open
    response = meth(req, response)
  File "/usr/lib/python3.1/urllib/request.py", line 468, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.1/urllib/request.py", line 394, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.1/urllib/request.py", line 328, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.1/urllib/request.py", line 476, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

EDIT 2: My solution is the following. Probably should have done this at the start as I knew it would work:

import urllib.request;

# Disguise as a Mozila browser on a Windows OS
userAgent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)';

# Counter for the filename.
i = 0;

while True:
    fname =  str(i).zfill(3) + '.png';
    URL = "www.example.com/img" + fname;

    f = open(fname, 'wb');

    try:
        req = urllib.request.Request(URL, headers={'User-Agent' : userAgent});
        response = urllib.request.urlopen(req);
    except:
        break;
    else:
        f.write(response.read());
        i+=1;
        response.close();
    finally:
        f.close();

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-06-01T02:41:59+00:00

urllib2 is fine for small scripts that only need to do one or two network interactions, but if you are doing a lot more work, you will likely find that either urllib3, or requests (which not coincidentally is built on the former), may suit your needs better. Your particular example might look like:

from itertools import count
import requests

HEADERS = {'user-agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'}
URL = "http://www.example.com/img%03d.png"

# with a session, we get keep alive
session = requests.session()

for n in count():
    full_url = URL % n
    ignored, filename = URL.rsplit('/', 1)

    with file(filename, 'wb') as outfile:
        response = session.get(full_url, headers=HEADERS)
        if not response.ok:
            break
        outfile.write(response.content)

Edit: If you can use regular HTTP authentication (for which the 403 Forbidden response strongly suggests), then you can add that to a requests.get with the auth parameter, as in:

response = session.get(full_url, headers=HEADERS, auth=('username','password))

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I have problems with my code. #!/usr/bin/env python3.1 import urllib.request; # Disguise as a

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply