It was my understanding that np.apply_along_axis was a viable substitution for iterating over Numpy

Question

Asked: June 17, 2026 (2026-06-17T13:13:47+00:00)

It was my understanding that np.apply_along_axis was a viable substitute for iterating over Numpy arrays, since looping the Python way has overhead that makes things slower; however, it seems that iterating takes only ~9% of the time apply_along_axis does! Piggybacking on this previous post, I decided to do a quick timing for myself:

import numpy as np
import timeit

def triv():
     ial = [i for i in xrange(100)]

def super(fluous):
     return fluous

>>> print (timeit.timeit("triv()", setup="from __main__ import triv"))
12.3305490909
>>> print (timeit.timeit("np.apply_along_axis(super, 0, np.arange(100))", setup="from __main__ import np, super"))
130.721563921

Why is this the case? I don’t know the intricacies of timeit that well (or much of anything about timeit, for that matter), but I think my examples are straightforward enough. I was wondering if anyone has found a good workaround, as my real-world problem of iterating over the Cartesian product of the rows in very large arrays is so slow that it’s impeding progress.

Thanks in advance.

1 Answer

Editorial Team · Answer 1 · 2026-06-17T13:13:49+00:00

In the first place, your example is strange because np.apply_along_axis on a 1D array just passes the whole array to the function as a single argument. The function isn’t called repeatedly for each element. For that you’d want np.vectorize.
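A minimal sketch of that difference, written for Python 3. The helper `make_recorder` is hypothetical and only exists here to count calls; `otypes` is passed to `np.vectorize` so that it does not make an extra probing call to infer the output type.

```python
import numpy as np

def make_recorder():
    """Return a pass-through function plus the list of arguments it received."""
    seen = []

    def record(arg):
        seen.append(arg)
        return arg

    return record, seen

arr = np.arange(5)

# apply_along_axis hands each 1-D slice to the function;
# a 1-D array has exactly one such slice, so there is a single call.
along_func, along_args = make_recorder()
np.apply_along_axis(along_func, 0, arr)

# np.vectorize calls the function once per element.
vec_func, vec_args = make_recorder()
np.vectorize(vec_func, otypes=[int])(arr)

print(len(along_args), "call(s) from apply_along_axis")
print(len(vec_args), "call(s) from np.vectorize")
```

Either way the function itself still runs as ordinary Python, so neither wrapper removes the per-call overhead.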

More generally, though, numpy can’t really speed up the application of arbitrary Python functions. The main advantage of numpy is that it provides its own implementations of lots of mathematical functions, and those are fast. It can’t magically take any function and just make it go faster.

In addition, your examples aren’t exactly parallel. You’re not calling the same function in both, for one thing. More specifically, your tests both include creating the input in the timed test — that is, you create xrange(100) and np.arange(100) inside the timed part of the test. So part of what you’re measuring is that it’s slower to create a Numpy array than to create an xrange object:

>>> timeit.timeit("np.arange(100)", "import numpy as np")
0.95243776
>>> timeit.timeit("xrange(100)", "import numpy as np")
0.2129926399999995

That’s a factor of roughly 4.5 right there. But in a real application, you’d almost certainly already have the input array created, so this isn’t a realistic test.
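One way to keep input creation out of the measurement is to build the array in timeit's `setup` string, which runs once and is not timed. The sketch below is for Python 3; the `.sum()` operation and the run count `N_RUNS` are illustrative assumptions, not part of the original tests.

```python
import timeit

N_RUNS = 10_000  # kept small so the sketch finishes quickly

# Array creation inside the timed statement: its cost is part of the measurement.
t_create_inside = timeit.timeit(
    "np.arange(100).sum()",
    setup="import numpy as np",
    number=N_RUNS,
)

# Array creation in setup: only the operation on the existing array is timed.
t_create_in_setup = timeit.timeit(
    "thing.sum()",
    setup="import numpy as np; thing = np.arange(100)",
    number=N_RUNS,
)

print(f"created inside timed statement: {t_create_inside:.4f} s")
print(f"created once in setup:          {t_create_in_setup:.4f} s")
```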

Using parallel tests, where both versions call the same function on the same pre-built array, I find that the list-comprehension version is only about twice as fast:

def doNothing(crud):
    return crud

>>> timeit.timeit("np.vectorize(doNothing)(thing)", "import numpy as np; from __main__ import doNothing; thing = np.arange(100)")
49.13036320000003
>>> timeit.timeit("[doNothing(x) for x in thing]", "import numpy as np; from __main__ import doNothing; thing = np.arange(100)")
25.873566400000072

Moreover, the numpy version can be substantially faster (more than ten times faster in this test) if what you’re applying is a numpy function:

>>> timeit.timeit("np.log(thing)", "import numpy as np; import math; from __main__ import doNothing; thing = np.arange(100)+1")
3.2039433600000393
>>> timeit.timeit("[math.log(x) for x in thing]", "import numpy as np; import math; from __main__ import doNothing; thing = np.arange(100)+1")
37.74219519999997
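The two statements timed above compute the same values; only where the loop runs differs. A small Python 3 check, assuming the same `thing` array as in the timings:

```python
import math
import numpy as np

thing = np.arange(100) + 1  # 1..100, so the logarithm is defined everywhere

# One call; the loop over elements runs inside numpy's compiled code.
vectorized = np.log(thing)

# One Python-level function call per element.
looped = np.array([math.log(x) for x in thing])

print(np.allclose(vectorized, looped))
```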

The moral of the story is that numpy really is NUMpy — it’s made for doing numerical calculations, and it has functions for doing them fast. It’s not just a thing that speeds up all your loops. If you just have big arrays of objects that you’re applying arbitrary functions to, numpy is unlikely to speed up your code, and may even slow it down. (It can still be very useful for non-numeric data because its facilities for things like complicated indexing into multidimensional arrays are convenient and may be faster than an equivalent Python structure using nested lists or the like. It’s just not useful for speeding up loops over these structures.)
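For the Cartesian-product-of-rows problem in the question, the same principle suggests expressing the per-pair work as array arithmetic where that is possible. The sketch below is for Python 3 and assumes, purely for illustration, that the per-pair operation is a squared Euclidean distance; the arrays `a` and `b` and their shapes are made up. If the real operation is an arbitrary Python function, this approach does not apply.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((4, 3))  # 4 rows (illustrative data)
b = rng.random((5, 3))  # 5 rows (illustrative data)

# Broadcasting pairs every row of a with every row of b: shape (4, 5, 3).
diff = a[:, None, :] - b[None, :, :]

# Reduce over the last axis to get one value per (row of a, row of b) pair.
pairwise = (diff ** 2).sum(axis=-1)  # shape (4, 5)

# The same result with explicit Python loops, for comparison.
looped = np.array([[((ra - rb) ** 2).sum() for rb in b] for ra in a])

print(pairwise.shape, np.allclose(pairwise, looped))
```

Note that the intermediate `diff` array holds rows(a) × rows(b) × columns values, so for very large inputs it may need to be processed in chunks.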
