
The Archive Base


Editorial Team
Asked: May 28, 2026 (2026-05-28T17:20:20+00:00)

I am trying to follow the multithreading example given in: Python urllib2.urlopen() is slow, need a better way to read several urls

I am trying to follow the multithreading example given in:
Python urllib2.urlopen() is slow, need a better way to read several urls, but I seem to get a “thread error” and I am not sure what it means.

import Queue
import threading
import urllib2
from urllib2 import HTTPError

urlList=[list of urls to be fetched]*100

def read_url(url, queue):
    my_data = []
    try:
        data = urllib2.urlopen(url, None, 15).read()
        print('Fetched %s from %s' % (len(data), url))
        my_data.append(data)
        queue.put(data)
    except HTTPError as e:
        data = urllib2.urlopen(url).read()
        print('Fetched %s from %s' % (len(data), url))
        my_data.append(data)
        queue.put(data)

def fetch_parallel():
    result = Queue.Queue()
    threads = [threading.Thread(target=read_url, args=(url, result)) for url in urlList]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result

res = []
res = fetch_parallel()
reslist = []
while not res.empty: reslist.append(res.get())
print (reslist)

I get the following first error:

Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 505, in run
self.__target(*self.__args, **self.__kwargs)
File "demo.py", line 76, in read_url
print('Fetched %s from %s' % (len(data), url))
TypeError: object of type 'instancemethod' has no len()

On the other hand, I see that sometimes, it does seem to fetch data, but then I get the following second error:

Traceback (most recent call last):
File "demo.py", line 89, in <module>
print str(res[0])
AttributeError: Queue instance has no attribute '__getitem__'

When it fetches data, why is the result not showing up in res[]? Thanks for your time.

Update: After changing read to read() in the read_url() function, the situation has improved (I now get many page fetches), but I still get this error:

Exception in thread Thread-86:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 505, in run
self.__target(*self.__args, **self.__kwargs)
File "demo.py", line 75, in read_url
data = urllib2.urlopen(url).read()
File "/usr/lib/python2.7/urllib2.py", line 126, in urlopen
return _opener.open(url, data, timeout)
File "/usr/lib/python2.7/urllib2.py", line 397, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 510, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 429, in error
result = self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 605, in http_error_302
return self.parent.open(new, timeout=req.timeout)
File "/usr/lib/python2.7/urllib2.py", line 397, in open
response = meth(req, response)
File "/usr/lib/python2.7/urllib2.py", line 510, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib/python2.7/urllib2.py", line 435, in error
return self._call_chain(*args)
File "/usr/lib/python2.7/urllib2.py", line 369, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 518, in http_error_default
raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 502: Bad Gateway
1 Answer

  1. Editorial Team
    Added an answer on May 28, 2026 at 5:20 pm (2026-05-28T17:20:21+00:00)

    Note that urllib2 is not thread-safe. Therefore, you should really use urllib3.

    Some of your problems are entirely unrelated to threading. Threads just make the error reporting more complex. Instead of

    data = urllib2.urlopen(url).read
    

    you want

    data = urllib2.urlopen(url).read()
    #                               ^^
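The difference can be reproduced without any network access. The following is a minimal sketch, assuming Python 3 and using io.BytesIO as a stand-in for the response object that urlopen returns; the helper name safe_len is hypothetical and only there for illustration.

```python
import io

# Stand-in for the file-like response that urlopen() returns (an assumption
# made so the example runs offline).
response = io.BytesIO(b'<html>hello</html>')

not_called = response.read   # no parentheses: a reference to the method itself
data = response.read()       # parentheses: the method runs and returns bytes


def safe_len(obj):
    """Return len(obj), or None if the object has no length."""
    try:
        return len(obj)
    except TypeError:
        return None


print(callable(not_called))  # the method object is callable but has no length
print(safe_len(not_called))
print(safe_len(data))
```

Calling len() on the method object is what raises the TypeError seen in the first traceback; calling it on the returned data works.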
    

    A 502 Bad Gateway error means that a gateway or proxy in front of the web service received an invalid response from the server behind it (most likely, an internal server of the service you’re connecting to is rebooting or not available). There’s nothing you can do about it on the client side: the URL is just not reachable right now. Use try..except to handle these errors, for example by printing a diagnostic message, by scheduling the URL to be retrieved again after an appropriate waiting period, or by leaving out the failed data set.
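One way to combine the diagnostic message and the waiting period is a small retry helper. This is only a sketch under stated assumptions: it targets Python 3 (where urllib2 was split into urllib.request and urllib.error), and the names fetch_once and fetch_with_retry, the number of attempts, and the delay are hypothetical choices, not part of the original code.

```python
import time
import urllib.error
import urllib.request


def fetch_once(url, timeout=15):
    """Download one URL and return the response body as bytes."""
    with urllib.request.urlopen(url, None, timeout) as response:
        return response.read()


def fetch_with_retry(url, fetch=fetch_once, attempts=3, delay=1.0):
    """Call fetch(url) up to `attempts` times; return None if every try fails."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except urllib.error.URLError as exc:
            # HTTPError (for example a 502) is a subclass of URLError.
            print('Attempt %d for %s failed: %s' % (attempt, url, exc))
            if attempt < attempts:
                time.sleep(delay)  # wait before asking again
    return None  # caller decides what to do with a URL that stayed unreachable
```

The fetch parameter exists so the retry logic can be exercised with a fake download function; in real use the default would be left in place.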

    To get the values from the queue, you can do the following:

    res = fetch_parallel()
    reslist = []
    while not res.empty():
      reslist.append(res.get_nowait()) # or get, doesn't matter here
    print (reslist)
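The reason the original loop never collected anything is that res.empty without parentheses is a method object, which is always truthy, so `not res.empty` is always False. A queue also cannot be indexed like a list, which explains the `__getitem__` error. The sketch below illustrates this, assuming Python 3 (where the Queue module is named queue); the worker and helper names are hypothetical.

```python
import queue
import threading


def worker(item, results):
    """Toy worker: put a computed value on the shared queue."""
    results.put(item * 2)


def run_workers(items):
    """Start one thread per item and return the queue once all have finished."""
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(item, results))
               for item in items]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def drain(results):
    """Move everything from the queue into a plain list."""
    collected = []
    while not results.empty():   # note the call: empty(), not empty
        collected.append(results.get_nowait())
    return collected


reslist = drain(run_workers([1, 2, 3]))
print(sorted(reslist))           # arrival order depends on thread scheduling
print(bool(queue.Queue().empty)) # the bare method object is truthy
```

Draining after every join() is safe here because no thread is still writing to the queue at that point.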
    

    There is also no way around real error handling in case a URL is really unreachable. Simply re-requesting it, as the except block in your code does, might work in some cases, but you must be able to handle the case that the remote host is truly unreachable at this time. How you do that depends on your application’s logic.
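As one possible shape for that error handling, each thread can report its failure through the queue instead of dying with a traceback, so the caller sees which URLs worked and which did not. This is a sketch, not the only approach; it assumes Python 3, and default_fetch, the (url, data, error) tuple layout, and the two result lists are hypothetical.

```python
import queue
import threading
import urllib.request


def default_fetch(url):
    """Download one URL and return the body as bytes."""
    with urllib.request.urlopen(url, None, 15) as response:
        return response.read()


def read_url(url, results, fetch=default_fetch):
    """Put (url, data, error) on the queue; error is None on success."""
    try:
        results.put((url, fetch(url), None))
    except OSError as exc:
        # In Python 3, URLError, HTTPError and socket timeouts all derive
        # from OSError, so one clause covers them.
        results.put((url, None, exc))


def fetch_parallel(urls, fetch=default_fetch):
    """Fetch all URLs in threads; return (fetched, failed) lists of pairs."""
    results = queue.Queue()
    threads = [threading.Thread(target=read_url, args=(url, results, fetch))
               for url in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    fetched, failed = [], []
    while not results.empty():
        url, data, error = results.get_nowait()
        if error is None:
            fetched.append((url, data))
        else:
            failed.append((url, error))
    return fetched, failed
```

The failed list can then be logged, retried later, or dropped, depending on what the application needs.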

© 2021 The Archive Base. All Rights Reserved