I’m building a website using Pyramid, and I want to fetch some data from other websites. Because there may be 50+ calls to urlopen, I want to use gevent to speed things up.
Here’s what I’ve got so far using gevent:
import urllib2
from gevent import monkey; monkey.patch_all()
from gevent import pool

# `pool` is the imported module; the bare name `gevent` is never bound here
gpool = pool.Pool()

def load_page(url):
    response = urllib2.urlopen(url)
    html = response.read()
    response.close()
    return html

def load_pages(urls):
    return gpool.map(load_page, urls)
Running pserve development.ini --reload gives:
NotImplementedError: gevent is only usable from a single thread.
I’ve read that I need to monkey patch before anything else, but I’m not sure where the right place is for that. Also, is this a pserve-specific issue? Will I need to re-solve this problem when I move to mod_wsgi? Or is there a way to handle this use-case (just urlopen) without gevent? I’ve seen suggestions for requests but I couldn’t find an example of fetching multiple pages in the docs.
Update 1:
I also tried eventlet from this SO question (almost directly copied from this eventlet example):
import eventlet
from eventlet.green import urllib2

def fetch(url):
    return urllib2.urlopen(url).read()

def fetch_multiple(urls):
    pool = eventlet.GreenPool()
    return pool.imap(fetch, urls)
However, when I call fetch_multiple, I get: TypeError: request() got an unexpected keyword argument 'return_response'
Update 2:
The TypeError from the previous update was likely from earlier attempts to monkeypatch with gevent and not properly restarting pserve. Once I restarted everything, it works properly. Lesson learned.
There are multiple ways to do what you want:
- Use a dedicated gevent thread, and explicitly dispatch all of your URL-opening jobs to that thread, which will then do the gevented urlopen requests.
- Use a different async framework instead of gevent, one that doesn’t work by magically greenletifying your code.
- Use a library that has its own async support, like pycurl.
- Build your server around gevent too, or find some other framework that works for both your web-serving and your web-client needs.
You could simulate the last one without changing frameworks by loading gevent first and having it monkeypatch your threads, forcing your existing threaded server framework to become a gevent server. But this may not work, or may mostly work but occasionally fail, or may work but be much slower… Really, using a framework designed to be gevent-friendly (or at least greenlet-friendly) is a much better idea, if that’s the way you want to go.
You mentioned that others had recommended requests. The reason you can’t find the documentation is that the built-in async code in requests was removed. See an older version for how it was used. It’s now available as a separate library, grequests. However, it works by implicitly wrapping requests with gevent, so it will have exactly the same issues as doing so yourself.
(There are other reasons to use requests instead of urllib2, and if you want to gevent it, it’s easier to use grequests than to do it yourself.)
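For the "just urlopen, without gevent" part of the question, here is a minimal sketch that uses a plain standard-library thread pool instead of greenlets. This is a swapped-in technique, not the gevent approach from the question. It assumes Python 3, so it uses urllib.request rather than urllib2. The names fetch and fetch_all, the fetcher hook, the 10-second timeout and the default of 8 workers are all made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen  # Python 3 counterpart of urllib2.urlopen


def fetch(url, timeout=10):
    """Download one page and return its body as bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetcher=fetch, max_workers=8):
    """Fetch every URL on a small thread pool, keeping the order of `urls`.

    `fetcher` is a hypothetical hook so the download step can be swapped out.
    """
    urls = list(urls)
    if not urls:
        return []
    workers = min(max_workers, len(urls))
    with ThreadPoolExecutor(max_workers=workers) as executor:
        # list() waits for the results in input order; an exception raised
        # inside a worker is re-raised here when its result is reached.
        return list(executor.map(fetcher, urls))
```

A view would call something like fetch_all(urls) and get back a list of page bodies. Because nothing is monkey patched, this should not depend on whether the app runs under pserve or mod_wsgi, though you may want to tune the worker count.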