I’m having trouble handling a certain redirect with Python. I’m requesting a page that

Asked: June 15, 2026 (2026-06-15T21:57:45+00:00)

I’m having trouble handling a certain redirect with Python. I’m requesting a page that apparently loads and immediately redirects to ww1.www.com. I’m assuming this is the case because every method I know of for inspecting headers and status codes returns normal-looking results (status code 200, the expected host/referrer params, etc.).

Here is what I have:

from BeautifulSoup import BeautifulSoup
import urllib
import psycopg2
import psycopg2.extras

db = psycopg2.connect(
    host='myIP',
    database='myDATABASE',
    user='myUSERNAME',
    password='myPASSWORD',
)

cursor = db.cursor(cursor_factory = psycopg2.extras.RealDictCursor)
cursor.execute("SELECT info FROM table")

for row in cursor:
    url = 'http://www.website.com/' + row['info']
    file_pointer = urllib.urlopen(url)
    html_object = BeautifulSoup(file_pointer)

    if file_pointer.getcode() != 200:
        continue

The if statement should skip the rest of the loop body whenever the status code is not 200. However, I get IndexErrors in the code after this section, and when I investigate the URL that triggers the error, I find that it redirects without ever giving me a 302 status code.

Any thoughts as to why I would be getting a 200 status code response while still redirecting? (I’ve also tried equivalents with urllib2 and httplib) Also, how can I prevent this from happening?

1 Answer

Editorial Team · Answer 1 · 2026-06-15T21:57:47+00:00

One thing that doesn’t look right:

html_object = BeautifulSoup(file_pointer) should operate on the data returned by urlopen, not on the handle itself, so html_object = BeautifulSoup(file_pointer.read()) is what’s wanted here.
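One practical reason to call read() explicitly can be sketched as follows. This is a Python 3 illustration in which io.BytesIO stands in for the handle that urlopen returns (an assumption, made so the example runs without a network, and the HTML content is made up). A file-like object can only be consumed once, so keeping the bytes in a variable lets you parse them and still look at them afterwards.

```python
import io

# Stand-in for the handle returned by urlopen (hypothetical content).
file_pointer = io.BytesIO(b"<html><body>hello</body></html>")

html = file_pointer.read()      # the full body, as bytes
leftover = file_pointer.read()  # nothing left: the handle is exhausted

print(len(html), leftover)
```

With the body held in html, the same bytes can be handed to the parser and also printed or logged when a page turns out not to contain what the later code expects.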

For debugging:

Install requests if you haven’t already. It’s a great library for this kind of thing.

Then:

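import requests

for row in cursor:
    # Same URL that the loop in the question builds
    url = 'http://www.website.com/' + row['info']
    page = requests.get(url)
    # page.history holds each redirect response that requests followed
    for hist in page.history:
        print('%s %s' % (hist.status_code, hist.url))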

And see if that throws out anything that’s puzzling…
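As for why a 200 can show up even though a redirect took place: urlopen follows HTTP redirects on its own and reports the status of the final response, so the 302 never reaches your code. The sketch below is a minimal, self-contained illustration using the Python 3 standard library (urllib.request rather than the Python 2 urllib in the question). The local test server, its /old and /new paths, and the names DemoHandler, fetch and run_demo are all hypothetical and exist only for this demonstration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old answers 302 and points at /new."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"<html><body>landed</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch(url, opener=None):
    """Return (status, final_url, was_redirected) for a GET request."""
    opener = opener or urllib.request.build_opener()
    with opener.open(url) as response:
        final_url = response.geturl()  # URL of the response actually returned
        return response.status, final_url, final_url != url


def run_demo():
    """Start the local server, fetch /old and /new, and return both results."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    # Talk to the local server directly, ignoring any proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        return fetch(base + "/old", opener), fetch(base + "/new", opener)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for status, final_url, redirected in run_demo():
        print(status, final_url, redirected)
```

In the loop from the question, the same idea would mean comparing the final URL with the one you requested and skipping the row when they differ. If the final URL matches and the history is empty, the redirect may be happening inside the page itself (for example a meta refresh tag or JavaScript), which no HTTP status code will reveal; that part is a guess about the site, not something the question confirms.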
