What’s wrong with this snippet of code?
import numpy as np
from scipy import stats
d = np.arange(10.0)
cutoffs = [stats.scoreatpercentile(d, pct) for pct in range(0, 100, 20)]
f = lambda x: np.sum(x > cutoffs)
fv = np.vectorize(f)
# why don't these two lines output the same values?
[f(x) for x in d] # => [0, 1, 2, 2, 3, 3, 4, 4, 5, 5]
fv(d) # => array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
Any ideas?
cutoffs is a list. The numbers you extract from d are all turned into float and passed to f by numpy.vectorize. (It's actually rather odd: it looks like it first tries numpy floats, which work like you want, and then it tries normal Python floats.) By a rather odd, stupid behavior in Python 2, floats are always less than lists, so instead of getting an elementwise array of booleans from x > cutoffs, you get a single False, and every sum comes out as 0.
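The scalar-type difference described above can be seen in isolation. This is only a sketch, assuming Python 3 and using the first three cutoff values as a stand-in list; the variable names elementwise and plain_result are made up for illustration. Under Python 3 the plain-float comparison raises a TypeError rather than quietly returning False as Python 2 did.

```python
import numpy as np

# A plain Python list, as in the question (first three cutoff values only).
cutoffs = [0.0, 1.8, 3.6]

# A NumPy scalar coerces the list to an array and compares elementwise.
elementwise = np.float64(2.0) > cutoffs

# A plain Python float does no such coercion. Python 2 returned False here;
# Python 3 refuses to order a float against a list.
try:
    plain_result = 2.0 > cutoffs
except TypeError:
    plain_result = "TypeError"

print(elementwise)
print(plain_result)
```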
To solve the problem, you can make cutoffs a numpy array instead of a list. (You could probably also move the comparison into numpy operations entirely instead of faking it with numpy.vectorize, but I do not know offhand.)
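Both suggestions can be sketched as follows. This is one possible version, not necessarily the only or best one: np.percentile is swapped in for stats.scoreatpercentile purely to keep the example NumPy-only (an assumption, relying on both using linear interpolation by default), and the names counts_vectorized and counts_broadcast are made up for illustration.

```python
import numpy as np

d = np.arange(10.0)

# Assumption: np.percentile replaces stats.scoreatpercentile here.
# It returns an ndarray, so cutoffs is no longer a plain list.
cutoffs = np.percentile(d, np.arange(0, 100, 20))

# Fix 1: keep the original lambda. A float compared with an ndarray
# broadcasts elementwise, so np.vectorize now gives the expected counts.
f = lambda x: np.sum(x > cutoffs)
fv = np.vectorize(f)
counts_vectorized = fv(d)

# Fix 2: drop np.vectorize and broadcast directly.
# d[:, None] has shape (10, 1), cutoffs has shape (5,), so the comparison
# has shape (10, 5); summing along axis 1 counts the cutoffs below each value.
counts_broadcast = (d[:, None] > cutoffs).sum(axis=1)

print(counts_vectorized)
print(counts_broadcast)
```

The broadcasting version avoids calling a Python function once per element, which is what numpy.vectorize does under the hood.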