I have two NumPy arrays, X and W, each of shape (N, N), produced at the end of a calculation. Subdivide the range of X into equal intervals with edges [min(X), min(X)+delta, min(X)+2*delta, ..., max(X)]. Given an interval starting point v, I'd like to know the total of the corresponding W values:
idx = (X>=v) & (X<(v+delta))
W[idx].sum()
I need this sum for every interval (i.e. across the entire range of X), and I need to do this for many different matrices X and W. Profiling has determined that this is the bottleneck. What I'm doing now amounts to:
import numpy as np
# edges of the equal-width intervals described above
edges = np.arange(X.min(), X.max() + delta, delta)
W_total = []
for v0, v1 in zip(edges, edges[1:]):
    idx = (X >= v0) & (X < v1)
    W_total.append(W[idx].sum())
How can I speed this up?
You can use numpy.histogram() to compute all those sums in a single operation:
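A minimal sketch of such a call, under some assumptions: the example data, the seed, and the name `n_bins` are hypothetical and not from the question, and the interval width is taken to be `delta = (X.max() - X.min()) / n_bins`. Passing `W` as `weights` makes each bin hold the sum of the W values whose X falls in that bin, rather than a count.

```python
import numpy as np

# Hypothetical example data standing in for the real X and W.
rng = np.random.default_rng(0)
N = 200
X = rng.random((N, N))
W = rng.random((N, N))

# Hypothetical number of equal-width intervals; the width is
# delta = (X.max() - X.min()) / n_bins.
n_bins = 20

# np.histogram flattens X and W, so the 2-D arrays can be passed as they are.
# W_total[i] is the sum of W over all elements with edges[i] <= X < edges[i+1].
W_total, edges = np.histogram(
    X, bins=n_bins, range=(X.min(), X.max()), weights=W
)
```

One difference from the explicit loop is worth keeping in mind: numpy.histogram() treats the last bin as closed on the right, so elements equal to max(X) are counted in the final interval instead of being dropped.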