I have a huge data set and I have to compute a series of properties for every point in it. My code is really slow and I would like to make it faster by parallelizing the loop. I would like each processor to compute the “series of properties” for a limited subsample of my data and then join all the properties together in one array.
I’ll try to explain what I have to do with an example.
Let’s say that my data set is the array x:
x = linspace(0,20,10000)
The “property” I want to get is, for instance, the square root of x:
prop = []
for i in arange(0, len(x)):
    prop.append(sqrt(x[i]))
The question is: how can I parallelize the above loop? Let’s assume I have 4 processors and I would like each of them to compute the sqrt of 10000/4 = 2500 points.
I tried looking at some Python modules like multiprocessing and mpi4py, but I couldn’t find the answer to such a simple question in the guides.
EDITS
Thank you all for the valuable comments and links you provided. However, I would like to clarify my question. I’m not interested in the sqrt function itself.
I am doing a series of operations within a loop. I know perfectly well that loops are bad and that vectorized operations are always preferable, but in this case I really have to use a loop. I won’t go into the details of my problem because that would add unnecessary complication to this question.
I would like to split my loop so that each processor does a part of it. I could run my code 40 times with 1/40 of the loop each and then merge the results, but this would be stupid.
This is a brief example:
for i in arange(0, len(x)):
    # do some complicated stuff
What I want is to use 40 CPUs to do this:
for ncpu in arange(0, 40):
    for i in arange(len(x)//40*ncpu, len(x)//40*(ncpu+1)):
        # do some complicated stuff
Is that possible with Python or not?
I’m not sure this is how you should do it, since I’d expect numpy to have a much more efficient way of going about it, but do you just mean something like this?
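A minimal sketch of that idea, assuming Python 3 and the standard library's multiprocessing.Pool. The names compute_property, serial_properties and parallel_properties are made up for illustration, and math.sqrt only stands in for whatever is really done per point:

```python
import math
from multiprocessing import Pool


def compute_property(value):
    # Hypothetical stand-in for the "complicated stuff" done for one data point.
    return math.sqrt(value)


def serial_properties(data):
    # Reference version: the plain loop from the question.
    return [compute_property(v) for v in data]


def parallel_properties(data, processes=4):
    # Pool.map hands the items out to the worker processes and returns
    # the results as one list, in the same order as the input.
    with Pool(processes=processes) as pool:
        return pool.map(compute_property, data)


if __name__ == "__main__":
    # Same values as linspace(0, 20, 10000), built without NumPy.
    x = [20.0 * i / 9999 for i in range(10000)]
    prop = parallel_properties(x, processes=4)
    print(len(prop))
```

The worker function has to be defined at module level so that it can be sent to the worker processes, and the `if __name__ == "__main__":` guard keeps the pool from being created again when a worker imports the module.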
Here are the results of timeit on both solutions. As @SvenMarcach points out, however, with a more expensive function multiprocessing will start to be much more effective.
At Sven’s request, here is the result of l = numpy.sqrt(x), which is significantly faster than either of the alternatives.
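The split described in the question, where each processor loops over one contiguous block of the data and the pieces are joined afterwards, could be sketched roughly as follows. This again assumes Python 3 and multiprocessing.Pool; chunk_bounds, process_chunk and parallel_by_chunks are hypothetical helper names, and math.sqrt is only a placeholder for the real loop body:

```python
import math
from multiprocessing import Pool


def chunk_bounds(n_items, n_chunks):
    # Return (start, stop) index pairs that cover range(n_items) in n_chunks
    # contiguous pieces; the first n_items % n_chunks pieces get one extra item.
    base, extra = divmod(n_items, n_chunks)
    bounds = []
    start = 0
    for k in range(n_chunks):
        stop = start + base + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def process_chunk(chunk):
    # Placeholder for the loop body: runs serially over one worker's share.
    return [math.sqrt(v) for v in chunk]


def parallel_by_chunks(data, n_workers=4):
    # One slice of the data per worker.
    chunks = [data[a:b] for a, b in chunk_bounds(len(data), n_workers)]
    with Pool(processes=n_workers) as pool:
        partial_results = pool.map(process_chunk, chunks)
    # Join the per-worker lists back into one list, preserving input order.
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged


if __name__ == "__main__":
    # Same values as linspace(0, 20, 10000), built without NumPy.
    x = [20.0 * i / 9999 for i in range(10000)]
    prop = parallel_by_chunks(x, n_workers=4)
    print(len(prop))
```

Sending a few large chunks rather than many single items keeps the communication overhead between processes small, which matters most when the work per point is cheap.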