I have some working code that I’m trying to speed up by eliminating the for loop.
I have a set of data in x,y pairs as two vectors, so x(k) and y(k) form a pair. I also have a set of bin edges (xe). For every bin j, there is a set of x values in that bin, defined by xe(j) <= x(k) < xe(j+1). For each bin, I would like to find the mean and standard deviation of all y(k) with x(k) in that bin.
MATLAB code that accomplishes this is below:
function [meany, standarddeviation] = ystatsvsx(xdata, ydata, xe)
meany = zeros([size(ydata,1) (length(xe)-1)]);
standarddeviation = meany;
% bin(k) is the index of the bin that xdata(k) falls into
[numx,bin] = histc(xdata, xe);
for j = 1:(length(xe) - 1)
    inds = bin == j;
    meany(j) = mean(ydata(inds));
    standarddeviation(j) = std(ydata(inds));
end
When xe is large, this function becomes slow. Does anyone have any suggestions about how to vectorize this code to eliminate the for loop? The number of data points in a given bin (numx) is variable.
One caveat: length(xe)*length(xdata) in these cases is very large (length(xdata) is always much larger than length(xe)), so it is not possible to use repmat to create a length(xe) x length(xdata) matrix.
You can use accumarray to do that. Try something like this:
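A minimal sketch of what an accumarray-based version could look like, assuming xdata and ydata are vectors of equal length. The function name ystatsvsx_acc is hypothetical, and the choice to drop points outside the bins and to fill empty bins with NaN is an assumption rather than something stated in the question:

```matlab
function [meany, standarddeviation] = ystatsvsx_acc(xdata, ydata, xe)
% Hypothetical name: loop-free sketch of ystatsvsx using accumarray.
nbins = length(xe) - 1;

% bin(k) is the index of the bin containing xdata(k), or 0 if it lies
% outside the range covered by xe. Columns keep accumarray happy.
[~, bin] = histc(xdata(:), xe(:));

% Keep only points in bins 1..nbins. histc assigns x == xe(end) to an
% extra bin with index nbins+1, which the original loop never visits.
valid = bin >= 1 & bin <= nbins;
bin = bin(valid);
y = ydata(:);
y = y(valid);

% Group y by bin index and reduce each group. Bins with no points are
% filled with NaN (assumption: NaN is an acceptable marker for "empty").
meany = accumarray(bin, y, [nbins 1], @mean, NaN).';
standarddeviation = accumarray(bin, y, [nbins 1], @std, NaN).';
end
```

This never builds a length(xe) x length(xdata) matrix; memory use stays proportional to length(xdata). Passing function handles still makes accumarray call mean and std once per non-empty bin, so if that remains too slow, the per-bin counts accumarray(bin, 1) together with the sums accumarray(bin, y) and accumarray(bin, y.^2) can give the same statistics, at some cost in numerical accuracy.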