I have a 2D numpy array, and I want to find every location of each of its unique elements. The unique elements themselves are easy to get with numpy.unique(array). The tricky part is that I need to know all the locations for every unique element. Consider the following example.
array([[1, 1, 2, 2],
       [1, 1, 2, 2],
       [3, 3, 4, 4],
       [3, 3, 4, 4]])
The result should be
1, (0,0),(1,1)
2, (0,2),(1,2)
3, (2,0),(3,1)
4, (2,2),(3,3)
How can I do this, and what would be a suitable way to store and iterate over the values?
Note that all occurrences of a unique value will be adjacent to each other. The only gaps between them can be zeros. Consider another variant:
array([[1, 0, 1, 2, 2],
       [1, 0, 1, 2, 2],
       [3, 0, 3, 4, 4],
       [3, 0, 3, 4, 4]])
The result should be
1, (0,0),(1,2)
2, (0,3),(1,4)
3, (2,0),(3,2)
4, (2,3),(3,4)
The zeros on the boundaries are to be ignored.
Thanks a lot.
The simple, brute-force way to do it is to just use numpy.where. For example, if you just want the bounding box:
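A minimal sketch of that idea, assuming the first example array is stored in a variable called data (the names data and bounding_boxes are illustrative, not from the question):

```python
import numpy as np

# Illustrative name; this is the first example array from the question.
data = np.array([[1, 1, 2, 2],
                 [1, 1, 2, 2],
                 [3, 3, 4, 4],
                 [3, 3, 4, 4]])

def bounding_boxes(arr):
    """Map each nonzero unique value to ((min_row, min_col), (max_row, max_col))."""
    boxes = {}
    for value in np.unique(arr):
        if value == 0:
            continue  # zeros are gaps between objects, not objects themselves
        # np.where returns one index array per axis for the matching cells.
        rows, cols = np.where(arr == value)
        boxes[int(value)] = ((int(rows.min()), int(cols.min())),
                             (int(rows.max()), int(cols.max())))
    return boxes

boxes = bounding_boxes(data)
for value, (top_left, bottom_right) in boxes.items():
    print(value, top_left, bottom_right)
```

A plain dict keyed by the unique value is one reasonable way to store the result and iterate over it. Each value costs a full pass over the array, which is why this is the brute-force option.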
This will work for the example with zeros, as well.
If the array is large, and you already have scipy around, you might consider using scipy.ndimage.find_objects instead, as @unutbu suggested.
In the particular case of your example, where your unique values are sequential integers, you can use find_objects directly. It expects an array where each sequential integer other than 0 represents an object whose bounding box it should return. (0 is ignored, exactly as you want.) However, in general, you’ll need to do a touch of pre-processing to convert arbitrary unique values to sequential integers.
find_objects returns a list of tuples of slice objects. Honestly, these are probably exactly what you want if you need the bounding box. However, printing the starting and stopping indices looks a bit messier.
The result will look slightly different than you might expect. These are slice objects, so the “max” value will always be one higher than the “max” in the previous example. This is so that you can simply slice with the given tuple to get the data in question. E.g.
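A short sketch of calling find_objects and slicing with its result, using the zero-separated variant from the question (the variable names data, object_slices and first_object are illustrative):

```python
import numpy as np
from scipy import ndimage

# Illustrative name; this is the second example array from the question.
data = np.array([[1, 0, 1, 2, 2],
                 [1, 0, 1, 2, 2],
                 [3, 0, 3, 4, 4],
                 [3, 0, 3, 4, 4]])

# One (row_slice, col_slice) tuple per label 1, 2, 3, ...; label 0 is ignored.
object_slices = ndimage.find_objects(data)
for label, slc in enumerate(object_slices, start=1):
    print(label, slc)

# Indexing with the tuple returns the block of data covering that object.
first_object = data[object_slices[0]]
print(first_object)
```

Because the stop of each slice is exclusive, the slice for label 1 covers rows 0 to 1 and columns 0 to 2, including the column of zeros that sits inside its bounding box.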
If you really want the starts and stops, just do something like this:
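One possible way to do that, as a sketch: read start and stop off each slice and subtract one from the stop to get an inclusive maximum. The helper name slice_bounds is made up for illustration.

```python
import numpy as np
from scipy import ndimage

data = np.array([[1, 0, 1, 2, 2],
                 [1, 0, 1, 2, 2],
                 [3, 0, 3, 4, 4],
                 [3, 0, 3, 4, 4]])

def slice_bounds(arr):
    """Return {label: ((min_row, min_col), (max_row, max_col))} with inclusive maxima."""
    bounds = {}
    for label, slc in enumerate(ndimage.find_objects(arr), start=1):
        if slc is None:
            continue  # find_objects yields None for labels missing from the array
        row_slice, col_slice = slc
        # slice.stop is exclusive, so subtract 1 to get the last index covered.
        bounds[label] = ((row_slice.start, col_slice.start),
                         (row_slice.stop - 1, col_slice.stop - 1))
    return bounds

bounds = slice_bounds(data)
for label, (top_left, bottom_right) in bounds.items():
    print(label, top_left, bottom_right)
```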
If your unique values are not sequential integers, you’ll need to do a bit of pre-processing, as I mentioned before. You might do something like this:
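A sketch of one such pre-processing step, assuming 0 still marks the gaps. It relabels the nonzero values as 1..N with numpy.unique and numpy.searchsorted, which is my choice here rather than the only option; the array contents and the names values, labels and boxes are invented for illustration.

```python
import numpy as np
from scipy import ndimage

# Hypothetical data: same layout as the question, but non-sequential values.
data = np.array([[30, 0, 30, 7, 7],
                 [30, 0, 30, 7, 7],
                 [12, 0, 12, 55, 55],
                 [12, 0, 12, 55, 55]])

# Sorted unique values, excluding the 0 background.
values = np.unique(data[data != 0])

# Position of each cell's value in `values`, shifted so labels start at 1.
labels = np.searchsorted(values, data) + 1
labels[data == 0] = 0  # keep the background as label 0 so it is ignored

# Label i corresponds to values[i - 1], so the two sequences zip in order.
boxes = {int(v): slc for v, slc in zip(values, ndimage.find_objects(labels))}
for value, slc in boxes.items():
    print(value, slc)
```

The resulting dict maps each original value back to its tuple of slices, so you can still index the original array with data[boxes[30]].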