I’m tinkering with some big-data queries in the IPython shell using the Django ORM. This is on a Debian 6 VM in VMware Fusion on OS X; the VM is allowed access to 4 or 8 cores (I’ve played with the settings) of the 4-core HT i7 on the host.
When I watch top while running something like ‘for result in results: do_query()’ in the Python shell, Python and one of the postgres processes always seem to be co-located on the same physical CPU core: their combined CPU usage never adds up to more than 100%, with Python usually at about 65% and postgres at about 25%. iowait on the VM isn’t excessively high.
I’m not positive they’re always on the same core, but it sure looks like it. Given how I plan to scale this eventually, I’d prefer that the Python process(es) and postgres workers be scheduled more optimally. Any insight?
Right now, if your code works the way I think it does, Postgres is always waiting for Python to send it a query, or Python is waiting for Postgres to come back with a response. There’s no situation where they’d both be doing work at once, so only one ever runs at a time. That alone would keep their combined CPU usage at or below 100%, even if they were running on different cores.
To start using your machine more heavily, you’ll need to run several queries at once from the Python end, with some sort of multithreading or multiprocessing. Since you haven’t given many details on what your queries are, it’s hard to say what that might look like.
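As a rough illustration only, here is a minimal sketch of the difference between issuing queries one at a time and issuing them from a small thread pool. It assumes Python 3 (concurrent.futures is in the standard library from 3.2). The names run_query, run_serial and run_threaded are hypothetical, and run_query just sleeps to stand in for a blocking ORM call; nothing here is taken from your actual code.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(item):
    """Hypothetical stand-in for one blocking ORM query."""
    time.sleep(0.05)  # simulate waiting on Postgres for a response
    return item * 2   # simulate a result derived from the query


def run_serial(items):
    # One query at a time: the client idles while the "database" works.
    return [run_query(item) for item in items]


def run_threaded(items, workers=4):
    # Several queries in flight at once; map() preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_query, items))


if __name__ == "__main__":
    items = list(range(8))

    start = time.perf_counter()
    serial = run_serial(items)
    serial_time = time.perf_counter() - start

    start = time.perf_counter()
    threaded = run_threaded(items)
    threaded_time = time.perf_counter() - start

    print("same results:", serial == threaded)
    print("serial: %.2fs, threaded: %.2fs" % (serial_time, threaded_time))
```

Threads help when most of the time is spent waiting on Postgres. If the Python side is doing a lot of CPU work itself, as the 65% figure hints, the GIL will limit what threads can gain, and swapping in ProcessPoolExecutor (same map interface) may be a better fit. With Django, each thread or process uses its own database connection, so Postgres would then have several backends busy at the same time.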