Is there anything I can do to speed up masked arrays in numpy? I had a terribly inefficient function that I re-wrote to use masked arrays (where I could just mask rows instead of make copies and delete rows as I was doing). However, I was shocked to find that the masked function was 10x slower because the masked arrays are so much slower.
As an example, take the following (masked is more then 6 times slower for me):
import timeit
import numpy as np
import numpy.ma as ma
def test(row):
return row[0] + row[1]
a = np.arange(1000).reshape(500, 2)
t = timeit.Timer('np.apply_along_axis(test, 1, a)','from __main__ import test, a, np')
print round(t.timeit(100), 6)
b = ma.array(a)
t = timeit.Timer('ma.apply_along_axis(test, 1, b)','from __main__ import test, b, ma')
print round(t.timeit(100), 6)
I have no idea why the masked array functions are moving so slowly, but since it sounds like you are using the mask to select rows (as opposed to individual values), you can create a regular array from the masked rows and use the np function instead:
Better yet, don’t use masked arrays at all; just maintain your data and a 1D mask array as separate variables: