I’m using numpy, in particular the histogram2d function.
I am binning a 3D spatial distribution of points (arrays x,y and z) with a 2d histogram. For each point I have an associated density field d.
If I do something like this:
import numpy as np
H, xedges, yedges = np.histogram2d(x,y,bins=200,weights=d)
The histogram H represents the sum of the density along the line of sight (in this case, along the z-axis). This is fast and easy, considering that I’m working with very big arrays.
Now I want to go further: instead of plotting the sum of the density field along the line of sight, I would like to get the maximum of the density in each 2D bin.
I coded a possible solution:
from numpy import *
x=array([0.5,0.5,0.2,0.3,0.2,0.25,0.35,0.6,0.1,0.22,0.7,0.45,0.57,0.65])
y=array([0.5,0.5,0.28,0.18,0.85,0.9,0.44,0.7,0.1,0.22,0.7,0.45,0.54,0.65])
d=array([1,1,2,2,3,5,6,8,7,9,6,10,5,7])
bins=linspace(0,1,64)
idx=digitize(x,bins)
idy=digitize(y,bins)
img2=zeros((len(bins),len(bins)))
for i in arange(0,len(d)):
    dummy=idx[i]
    dummy2=idy[i]
    img2[dummy][dummy2]=max(d[i],img2[dummy][dummy2])
However, the loop in the last lines might be really slow for a huge dataset. Any idea how I can make it faster?
Here is how I would do it (sorry, I don’t have time to write up the code right now):
1. numpy.ravel_multi_index to turn the 2d problem into a 1d problem.
2. numpy.unique: you want to do something like that to get unique bin values, but you want to do it in such a way that it also gives you the min/max of d at the same time. numpy.lexsort might also help here.
3. img2.flat[uniq_1d_bin_value] = bin_max

I hope that’s enough to get you started. If you have trouble, you can post your code and let us know where you got stuck and maybe I, or someone else, can help put you on the right path again.
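For what it's worth, the three steps above could be sketched roughly as follows, using the sample data from the question. This is only one possible reading of the outline, not necessarily what the answerer had in mind. It assumes every point falls inside the bin range (so the digitize indices fit in the image), that d is a signed or floating-point array (so negating it is safe), and that empty bins should stay at 0. The names flat, order, first and shape are made up for the illustration; uniq_1d_bin_value and bin_max are taken from the outline.

```python
import numpy as np

x = np.array([0.5, 0.5, 0.2, 0.3, 0.2, 0.25, 0.35, 0.6, 0.1, 0.22, 0.7, 0.45, 0.57, 0.65])
y = np.array([0.5, 0.5, 0.28, 0.18, 0.85, 0.9, 0.44, 0.7, 0.1, 0.22, 0.7, 0.45, 0.54, 0.65])
d = np.array([1, 1, 2, 2, 3, 5, 6, 8, 7, 9, 6, 10, 5, 7])

bins = np.linspace(0, 1, 64)
idx = np.digitize(x, bins)
idy = np.digitize(y, bins)
shape = (len(bins), len(bins))  # same image size as in the question

# Step 1: collapse each (idx, idy) pair into a single flat bin index.
# Assumption: all indices are < len(bins); otherwise this raises ValueError.
flat = np.ravel_multi_index((idx, idy), shape)

# Step 2: sort by bin (primary key, given last) and by descending d within
# each bin, so the first entry of every bin is that bin's maximum.
order = np.lexsort((-d, flat))
uniq_1d_bin_value, first = np.unique(flat[order], return_index=True)
bin_max = d[order][first]

# Step 3: write the per-bin maxima into the flattened image.
img2 = np.zeros(shape)
img2.flat[uniq_1d_bin_value] = bin_max
```

Unlike the loop in the question, this never compares against the initial 0, so the two would differ if d contained negative values. NumPy's ufunc.at method, for example np.maximum.at(img2, (idx, idy), d) on a zero-initialised img2, looks like another loop-free option that might be worth timing against this one.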