I am trying to learn nditer for possible use in speeding up my application. Here, I try to write a facetious reshape program that takes a size-20 array and reshapes it to a 5×4 array:
import numpy as np
myArray = np.arange(20)
def fi_by_fo_100(array):
    offset = np.array([0, 4, 8, 12, 16])
    it = np.nditer([offset, None],
                   flags=['reduce_ok'],
                   op_flags=[['readonly'],
                             ['readwrite','allocate']],
                   op_axes=[None, [0,1,-1]],
                   itershape=(-1, 4, offset.size))
    while not it.finished:
        indices = np.arange(it[0],(it[0]+4), dtype=int)
        info = array.take(indices)
        '''Just for fun, we'll perform an operation on data.\
        Let's shift it to 100'''
        info = info + 81
        it.operands[1][...]=info
        it.iternext()
    return it.operands[1]
test = fi_by_fo_100(myArray)
>>> test
array([[ 97, 98, 99, 100]])
Clearly the program is overwriting each result into one row. So I tried using the indexing functionality of nditer, but still no dice.
flags=['reduce_ok','c_index'] -> it.operands[1][it.index][...]=info
IndexError: index out of bounds
flags=['reduce_ok','c_index'] -> it.operands[1][it.iterindex][...]=info
IndexError: index out of bounds
flags=['reduce_ok','multi_index'] -> it.operands[1][it.multi_index][...]=info
IndexError: index out of bounds
it[0][it.multi_index[1]][...]=info
IndexError: 0-d arrays can't be indexed
…and so on. What am I missing? Thanks in advance.
Bonus Question
I just happened across this nice article on nditer. I may be new to NumPy, but this is the first time I’ve seen NumPy speed benchmarks this far behind. It’s my understanding that people choose NumPy for its numerical speed and prowess, but iteration is a part of that, no? What is the point of nditer if it’s so slow?
It really helps to break things down by printing out what’s going on along the way.
First, let’s replace your whole loop with this:
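A minimal sketch of such a counting loop. To keep it self-contained, `it` here is a stand-in with the same 20-step iteration space, built by plain broadcasting rather than with your `op_axes` and `itershape` arguments; that substitution is an assumption of this sketch.

```python
import numpy as np

offset = np.array([0, 4, 8, 12, 16])

# Stand-in iterator (an assumption, not the question's exact call):
# broadcasting the 5 offsets against a (4, 1) dummy operand yields a
# 4x5 iteration space, so the loop body runs once per element.
it = np.nditer([offset, np.zeros((4, 1))], order='C')

i = 0
while not it.finished:
    i += 1          # count every step the iterator takes
    it.iternext()
print(i)            # one count per iteration step, not one per offset
```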
It’ll print 20, not 5. That’s because you’re doing a 5×4 iteration, not 5×1.
So, why is this even close to working? Well, let’s look at the loop more carefully:
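Something along these lines shows what each step sees. Again `it` is a broadcast stand-in for your iterator (an assumption), and `seen` is a hypothetical helper list that only exists to record the offsets in the order they come up.

```python
import numpy as np

array = np.arange(20)
offset = np.array([0, 4, 8, 12, 16])

# Stand-in for the question's iterator: a 4x5 iteration space in C order,
# so the offsets cycle fastest.
it = np.nditer([offset, np.zeros((4, 1))], order='C')

seen = []  # hypothetical helper: offsets in the order they are visited
while not it.finished:
    start = int(it[0])                                   # current offset
    info = array.take(np.arange(start, start + 4)) + 81  # one shifted row
    print(start, info)
    seen.append(start)
    it.iternext()
```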
You’ll see that the first five loops go through [0 4 8 12 16] in turn, generating [[81 82 83 84]], then [[85 86 87 88]], etc. And then the next five loops do the same thing, and again and again.
This is also why your c_index solutions didn’t work: it.index is going to range from 0 to 19, and you don’t have 20 of anything in it.operands[1].
If you did the multi_index right and ignored the columns, you could make this work… but still, you’d be doing a 5×4 iteration, just to repeat each step 4 times, instead of doing the 5×1 iteration you want.
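To make the index ranges concrete, here is a rough sketch that records what the `c_index` and `multi_index` flags report over a 20-step iteration. The iterators are broadcast stand-ins (an assumption), and `out` is a hypothetical (1, 4) array with the same shape as the output you got.

```python
import numpy as np

offset = np.array([0, 4, 8, 12, 16])
out = np.zeros((1, 4))  # hypothetical stand-in for it.operands[1]

# Flat C-order index of every step.
it = np.nditer([offset, np.zeros((4, 1))], flags=['c_index'], order='C')
flat = []
while not it.finished:
    flat.append(it.index)
    it.iternext()

# Multi-dimensional index of every step.
it = np.nditer([offset, np.zeros((4, 1))], flags=['multi_index'], order='C')
multi = []
while not it.finished:
    multi.append(it.multi_index)
    it.iternext()

print(flat[0], flat[-1])    # first and last flat index
print(multi[0], multi[-1])  # first and last index tuple

# Using the flat index on an output that has only one row fails.
try:
    out[flat[-1]]
    failed = False
except IndexError as exc:
    failed = True
    print("IndexError:", exc)
```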
Your it.operands[1][...]=info replaces the entire output with a single 4-element row each time through the loop. Generally, you shouldn’t ever have to do anything to it.operands[1]; the whole point of nditer is that you just take care of each it[1], and the final it.operands[1] is the result.
Of course a 5×4 iteration over rows makes no sense. Either do a 5×4 iteration over individual values, or a 5×1 iteration over rows.
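The overwriting is easy to reproduce without nditer at all. In this sketch `out` is a hypothetical (1, 4) array standing in for the allocated output, and a plain Python loop stands in for the iterator.

```python
import numpy as np

array = np.arange(20)
out = np.zeros((1, 4), dtype=array.dtype)  # stand-in for it.operands[1]

for start in (0, 4, 8, 12, 16):
    info = array[start:start + 4] + 81
    out[...] = info  # assigns to ALL of `out`, discarding the previous row

print(out)  # only the row for the last offset (16) survives
```

That is the same single row of 97 to 100 that your function returned.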
If you want the former, the easiest way to do it is to reshape the input array, then just iterate that:
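A minimal sketch of that version, following the allocate-an-output pattern from the NumPy nditer documentation. The hard-coded (5, 4) shape comes from your example and is an assumption, not something general.

```python
import numpy as np

def fi_by_fo_100(array):
    source = array.reshape(5, 4)  # assumes exactly 20 elements
    # None asks nditer to allocate an output shaped like the iteration.
    with np.nditer([source, None]) as it:
        for x, y in it:
            y[...] = x + 81       # write one element at a time
        return it.operands[1]     # the allocated 5x4 result

print(fi_by_fo_100(np.arange(20)))
```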
But of course that’s silly—it’s just a slower and more complicated way of writing:
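Presumably something like the following, which lets NumPy do the whole thing in one vectorized step (again assuming the 20-element input from the question):

```python
import numpy as np

def fi_by_fo_100(array):
    # Reshape to 5 rows of 4, then shift every element by 81 at once.
    return array.reshape(5, 4) + 81

print(fi_by_fo_100(np.arange(20)))
```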
And it would be a bit silly to suggest that “the way to write your own reshape is to first call reshape, and then…”
So, you want to iterate over rows, right?
Let’s simplify things a bit by getting rid of the allocate and explicitly creating a 5×4 array to start with:
This is a bit of an abuse of nditer, but at least it does the right thing.
Since you’re just doing a 1D iteration over the source and basically ignoring the second, there’s really no good reason to use nditer here. If you need to do lockstep iteration over multiple arrays, for a, b in nditer([x, y], …) is cleaner than iterating over x and using the index to access y, just like for a, b in zip(x, y) outside of numpy. And if you need to iterate over multi-dimensional arrays, nditer is usually cleaner than the alternatives. But here, all you’re really doing is iterating over [0, 4, 8, 12, 16], doing something with the result, and copying it into another array.
Also, as I mentioned in the comments, if you find yourself using iteration in numpy, you’re usually doing something wrong. All of the speed benefits of numpy come from letting it execute the tight loops in native C/Fortran or lower-level vector operations. Once you’re looping over arrays, you’re effectively just doing slow Python numerics with a slightly nicer syntax:
On my system, this prints:
That shows you the cost of using nditer unnecessarily.
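The row-wise version with an explicitly created 5×4 output, and the kind of timing comparison described above, can be sketched together as follows. The names `rowwise_nditer` and `vectorized` are hypothetical, the iterator walks only the offsets (a simplification, not necessarily the exact code the description refers to), and the numbers depend entirely on your machine, so none are shown.

```python
import timeit
import numpy as np

def rowwise_nditer(array):
    # Assumption: one output row per offset, written by the C index.
    offset = np.array([0, 4, 8, 12, 16])
    out = np.zeros((offset.size, 4), dtype=array.dtype)
    it = np.nditer(offset, flags=['c_index'])
    for start in it:
        first = int(start)
        out[it.index] = array.take(np.arange(first, first + 4)) + 81
    return out

def vectorized(array):
    # The same result with no Python-level loop.
    return array.reshape(5, 4) + 81

my_array = np.arange(20)
same = np.array_equal(rowwise_nditer(my_array), vectorized(my_array))

loops = 1000
t_iter = timeit.timeit(lambda: rowwise_nditer(my_array), number=loops)
t_vec = timeit.timeit(lambda: vectorized(my_array), number=loops)
print(f"results equal: {same}")
print(f"nditer rows: {t_iter:.4f}s, vectorized: {t_vec:.4f}s ({loops} calls each)")
```

Both functions return the same 5×4 array; only the time they take differs.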