I have a three-dimensional ndarray of 2D coordinates, for example:
[[[1704 1240]
  [1745 1244]
  [1972 1290]
  [2129 1395]
  [1989 1332]]

 [[1712 1246]
  [1750 1246]
  [1964 1286]
  [2138 1399]
  [1989 1333]]

 [[1721 1249]
  [1756 1249]
  [1955 1283]
  [2145 1399]
  [1990 1333]]]
The ultimate goal is to remove the point closest to a given point ([1989 1332]) from each “group” of 5 coordinates. My thought was to produce a similarly shaped array of distances, and then use argmin to determine the indices of the values to be removed. However, I am not certain how to go about applying a function, such as one that calculates the distance to a given point, to every element in an ndarray, at least in a NumPythonic way.
List comprehensions are a very inefficient way to deal with numpy arrays. They’re an especially poor choice for the distance calculation.
To find the difference between your data and a point, you’d just do data - point. You can then calculate the distance using np.hypot, or if you’d prefer, square it, sum it, and take the square root.
It’s a bit easier if you make it an Nx2 array for the purposes of the calculation, though.
Basically, you want something like this:
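A minimal sketch of that calculation, using the array from the question. The variable names (data, point, diff, dist) are chosen here for illustration, and the final shape of one distance per point is an assumption about what is most convenient for the later steps.

```python
import numpy as np

# Example data from the question: 3 groups of 5 (x, y) points.
data = np.array([[[1704, 1240], [1745, 1244], [1972, 1290], [2129, 1395], [1989, 1332]],
                 [[1712, 1246], [1750, 1246], [1964, 1286], [2138, 1399], [1989, 1333]],
                 [[1721, 1249], [1756, 1249], [1955, 1283], [2145, 1399], [1990, 1333]]])
point = np.array([1989, 1332])

# Flatten to an Nx2 array so the x and y offsets are easy to pull apart.
diff = data.reshape(-1, 2) - point

# np.hypot(dx, dy) computes sqrt(dx**2 + dy**2) elementwise.
dist = np.hypot(diff[:, 0], diff[:, 1])

# Back to one distance per point: shape (groups, points_per_group).
dist = dist.reshape(data.shape[:2])
print(dist)
```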
This yields:
Now, removing the closest element is a bit harder than simply getting the closest element.
With numpy, you can use boolean indexing to do this fairly easily.
However, you’ll need to worry a bit about the alignment of your axes.
The key is to understand that numpy lines up array shapes for “broadcasting” starting from the last axis. In this case, we want to broadcast along the middle axis, so the per-group minimum has to keep a length-1 axis in that position.
Also, -1 can be used as a placeholder for the size of an axis. When reshaping, numpy will work out the size of that axis from the total number of elements.
What we’d need to do would look a bit like this:
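One way the masking step might look, as a sketch under the assumption that the distances are stored with shape (groups, points). The names closest, mask and kept are illustrative only.

```python
import numpy as np

data = np.array([[[1704, 1240], [1745, 1244], [1972, 1290], [2129, 1395], [1989, 1332]],
                 [[1712, 1246], [1750, 1246], [1964, 1286], [2138, 1399], [1989, 1333]],
                 [[1721, 1249], [1756, 1249], [1955, 1283], [2145, 1399], [1990, 1333]]])
point = np.array([1989, 1332])

# One distance per point, shape (3, 5).
diff = data - point
dist = np.hypot(diff[..., 0], diff[..., 1])

# Smallest distance in each group. The -1 lets numpy infer the number of
# groups, giving shape (3, 1) so it broadcasts across the points axis.
closest = dist.min(axis=1).reshape(-1, 1)

# True everywhere except at the closest point of each group.
mask = dist != closest

# Boolean indexing flattens the surviving points to shape (12, 2).
kept = data[mask]
print(mask)
print(kept.shape)
```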
You could make that a single line; I’m just breaking it down for readability. The key is that dist != something yields a boolean array which you can then use to index the original array.
So, putting it all together:
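A compact sketch combining the steps. The function name remove_closest is hypothetical, and the code assumes each group has exactly one closest point.

```python
import numpy as np

def remove_closest(data, point):
    """Drop the point nearest to `point` from each group (assumes no ties)."""
    # Distance from every point to `point`, shape (groups, points).
    diff = data.reshape(-1, 2) - point
    dist = np.hypot(diff[:, 0], diff[:, 1]).reshape(data.shape[:2])
    # False at each group's minimum distance, True elsewhere.
    mask = dist != dist.min(axis=1).reshape(-1, 1)
    # Boolean indexing flattens the result, so restore the grouping.
    return data[mask].reshape(data.shape[0], -1, 2)

data = np.array([[[1704, 1240], [1745, 1244], [1972, 1290], [2129, 1395], [1989, 1332]],
                 [[1712, 1246], [1750, 1246], [1964, 1286], [2138, 1399], [1989, 1333]],
                 [[1721, 1249], [1756, 1249], [1955, 1283], [2145, 1399], [1990, 1333]]])

result = remove_closest(data, [1989, 1332])
print(result)
```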
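Yields:
[[[1704 1240]
  [1745 1244]
  [1972 1290]
  [2129 1395]]

 [[1712 1246]
  [1750 1246]
  [1964 1286]
  [2138 1399]]

 [[1721 1249]
  [1756 1249]
  [1955 1283]
  [2145 1399]]]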
On a side note, if more than one point in a group is equally close, this won’t work: the mask drops every tied point, so the groups end up with different sizes. Numpy arrays have to have the same number of elements along each dimension, so you’ll need to re-do your grouping in that case.
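If dropping exactly one point per group is acceptable even when there is a tie, one possible workaround is to build the mask from argmin instead of comparing against the minimum. This is a different technique from the comparison-based mask, sketched here with a hypothetical function name and made-up tie data; argmin returns the first minimum it finds, so the earliest tied point is the one removed.

```python
import numpy as np

def remove_one_closest(data, point):
    """Drop exactly one nearest point per group, even if several are tied."""
    diff = data - np.asarray(point)
    dist = np.hypot(diff[..., 0], diff[..., 1])   # shape (groups, points)
    mask = np.ones(dist.shape, dtype=bool)
    # Pair each group index with the index of its (first) closest point.
    rows = np.arange(data.shape[0])
    mask[rows, dist.argmin(axis=1)] = False
    return data[mask].reshape(data.shape[0], -1, 2)

# Illustrative data: in the first group, [0, 1] and [1, 0] are both at
# distance 1 from the origin.
tied = np.array([[[0, 1], [1, 0], [5, 5]],
                 [[2, 2], [0, 3], [3, 0]]])
trimmed = remove_one_closest(tied, [0, 0])
print(trimmed)
```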