In my code I have for loop that indexes over a multidimensional numpy array and does some operation using the sub-array that is obtained at each iteration. It looks like this
for sub in Arr:
#do stuff using sub
Now the stuff that is done using sub is fully vectorized, so it should be efficient. On the other hand this loop iterates about ~10^5 times and is the bottleneck. Do you think I will get an improvement by offloading this part to C. I am somewhat reluctant to do so because the do stuff using sub uses broadcasting, slicing, smart-indexing tricks that would be tedious to write in plain C. I would also welcome thoughts and suggestions about how to deal with broadcasting, slicing, smart-indexing when offloading computation to C.
San you can take a look at
scipy.weave. You can usescipy.weave.blitzto transparently translate your expression intoC++code and run it. It will handle slicing automatically and get rid of temporaries, but you claim that the body of yourforloop does not create temporaries so your milage may vary.However if you want to replace your entire for loop with something more efficient then you could make use of
scipy.inline. The drawback is that you have to writeC++code. This should not be too hard because you can useBlitz++syntax which is very close to numpy array expressions. Slicing is directly supported, broadcasting however is not.There are two work arounds:
is to use the numpy-C api and use multi-dimensional iterators. They transparently handle broadcasting. However you are invoking the Numpy runtime so there might be some overhead. The other option, and possibly the simpler option is to use the usual matrix notation for broadcasting. Broadcast operations can be written as outer-products with vector of all ones. The good thing is that
Blitz++will not actually create this temporary broadcasted arrays in memory, it will figure out how to wrap it into an equivalent loop.For the second option take a look at http://www.oonumerics.org/blitz/docs/blitz_3.html#SEC88 for index place holders. As long as your matrix has less than 11 dimensions you are fine. This link shows how they can be used to form outer products http://www.oonumerics.org/blitz/docs/blitz_3.html#SEC99 (search for outer products to go to the relevant part of the document).