I have a simple webservice that I need to scale up substantially.
I’m trying to decide which of the various web frameworks, load balancers, and app servers to use (e.g. Mongrel2, Tornado, nginx, mod_proxy).
I have an existing Python app (currently exposed via BaseHTTPServer) that accepts some JSON data (about 900 KB per request) and returns some JSON data (about 1 KB). The processing is algorithmic and done in a mixture of Python and some C (via Cython).
This is already heavily optimized (down to 1.1 seconds per job from more than an hour), and I can't optimize it any further. While I rewrite it in something a bit more thread-friendly, I need to scale out horizontally (EC2, maybe).
There is no session or state, but the startup time of the app is quite slow (even with pickling and caching). It takes about 3 seconds to load all the source data. Once running, it takes about 1.1 seconds per request.
Maybe I could spin up a number of copies and then reverse proxy them? Maybe I could do some funky worker pool in one of those frameworks? But I’m still in the unknown unknowns here.
First, you should decouple your webservice layer from the number crunching. Use an external job queue (for example Celery, http://celeryproject.org/) to take the work off the web frontend. Then you can scale each part independently.
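To make the decoupling concrete, here is a minimal sketch of the pattern, not a production setup. It uses an in-process `queue.Queue` and threads purely as a stand-in for an external job queue such as Celery. In a real deployment the workers would be separate processes or machines fed by a broker, since threads won't speed up CPU-bound Python. The names `load_source_data`, `crunch`, `worker` and `run_jobs` are hypothetical placeholders for your own code. The point is that each worker pays the slow data load once and then serves many jobs.

```python
import json
import queue
import threading


def load_source_data():
    """Hypothetical stand-in for the slow (about 3 s) startup load."""
    return {"scale": 2}


def crunch(source_data, payload):
    """Hypothetical stand-in for the (about 1.1 s) algorithmic step."""
    return {"total": source_data["scale"] * sum(payload["values"])}


def worker(jobs, results):
    # The expensive load happens once per worker, not once per request.
    source_data = load_source_data()
    while True:
        item = jobs.get()
        if item is None:  # sentinel: no more work for this worker
            break
        job_id, body = item
        results[job_id] = json.dumps(crunch(source_data, json.loads(body)))


def run_jobs(bodies, num_workers=2):
    """Play the role of the web frontend: enqueue jobs, collect results."""
    jobs = queue.Queue()
    results = {}
    threads = [
        threading.Thread(target=worker, args=(jobs, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for job_id, body in enumerate(bodies):
        jobs.put((job_id, body))
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Return results in the order the requests were submitted.
    return [results[i] for i in range(len(bodies))]
```

With a real queue, the frontend only needs to accept the request, enqueue it, and return the result when it is ready, so you can add workers without touching the web layer.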
You should look at IaaS-type cloud providers (EC2, Rackspace, Linode, Softlayer, etc.), where you can add nodes automatically (the preferred way is to spin up a preconfigured image to minimize node setup time).
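As a rough guide to how many nodes to add, here is a back-of-the-envelope sketch based on the figures in the question (1.1 seconds per job, about 900 KB per request). The target request rate and the number of worker processes per node are assumptions you would need to supply. The function names are hypothetical, and the estimate ignores queueing delays and headroom.

```python
import math

SERVICE_MS = 1100   # from the question: about 1.1 s per job
REQUEST_KB = 900    # from the question: about 900 KB of JSON per request


def workers_needed(requests_per_second):
    """Workers required to keep up, assuming one job per worker at a time."""
    return math.ceil(requests_per_second * SERVICE_MS / 1000)


def nodes_needed(requests_per_second, workers_per_node):
    """Nodes required, given an assumed number of workers per node."""
    return math.ceil(workers_needed(requests_per_second) / workers_per_node)


def ingress_kb_per_second(requests_per_second):
    """Approximate inbound traffic the frontend must absorb."""
    return requests_per_second * REQUEST_KB
```

For example, at an assumed 10 requests per second this gives 11 busy workers, which is 3 nodes at 4 workers each, and roughly 9000 KB/s of inbound JSON.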