If I have a series of CPU-intensive operations, will multi-threading my program necessarily decrease its runtime? What are the trade-offs of doing so? In this case, I’m trying to compute the nullspace of a very large matrix. I’m using Python and, specifically, the numpy package:
def nullspace(A, eps=1e-15):
"""Computes the null space of the real matrix A."""
n, m = shape(A)
if n > m :
return nullspace(transpose(A), eps)
_, s, vh = linalg.svd(A)
s = append(s, zeros(m))[0:m]
null_mask = (s <= eps)
null_space = compress(null_mask, vh, axis=0)
return null_space.tolist()
Also, I would be interested to know just how one would go about multi-threading such a function. Thanks in advance.
Python has the Global Interpreter Lock (GIL), which only allows one thread to interact with the interpreter at a time — effectively, this means that you can only run one thread of Python at a time. This is a severe disadvantage when trying to run multiple threads.
However,
numpyis built on top of a heavily-optimised library for numerical linear algebra called LAPACK. If you install the right version of LAPACK for your system, it will parallelise its computations for you. You can then installnumpyon top of your LAPACK, and the Python computations will be parallelised.This also means that many
numpyoperations release the GIL, so that you can fire off a longnumpycomputation in a Python thread and simultaneously execute other Python. Thanks @JFSebastian.