On the Wikipedia entry for k-d trees, an algorithm is presented for doing a nearest neighbor search on a k-d tree. What I don’t understand is the explanation of step 3.2. How do you know there isn’t a closer point just because the difference between the splitting coordinate of the search point and the current node is greater than the difference between the splitting coordinate of the search point and the current best?
Nearest neighbor search

(Animation of NN searching with a KD Tree in 2D)

The nearest neighbor (NN) algorithm aims to find the point in the tree which is nearest to a given input point. This search can be done efficiently by using the tree properties to quickly eliminate large portions of the search space. Searching for a nearest neighbor in a kd-tree proceeds as follows:
1. Starting with the root node, the algorithm moves down the tree recursively, in the same way that it would if the search point were being inserted (i.e. it goes right or left depending on whether the point is greater or less than the current node in the split dimension).
2. Once the algorithm reaches a leaf node, it saves that node point as the “current best”.
3. The algorithm unwinds the recursion of the tree, performing the following steps at each node:
   1. If the current node is closer than the current best, then it becomes the current best.
   2. The algorithm checks whether there could be any points on the other side of the splitting plane that are closer to the search point than the current best. In concept, this is done by intersecting the splitting hyperplane with a hypersphere around the search point that has a radius equal to the current nearest distance. Since the hyperplanes are all axis-aligned this is implemented as a simple comparison to see whether the difference between the splitting coordinate of the search point and current node is less than the distance (overall coordinates) from the search point to the current best.
      1. If the hypersphere crosses the plane, there could be nearer points on the other side of the plane, so the algorithm must move down the other branch of the tree from the current node looking for closer points, following the same recursive process as the entire search.
      2. If the hypersphere doesn’t intersect the splitting plane, then the algorithm continues walking up the tree, and the entire branch on the other side of that node is eliminated.
4. When the algorithm finishes this process for the root node, then the search is complete.

Generally the algorithm uses squared distances for comparison to avoid computing square roots. Additionally, it can save computation by holding the squared current best distance in a variable for comparison.
Look carefully at the 6th frame of the animation on that page.
As the algorithm goes back up the recursion, it is possible that there is a closer point on the other side of the splitting hyperplane at the current node. We’ve checked one half, but there could be an even closer point in the other half.
Well, it turns out we can sometimes make a simplification. If it’s impossible for there to be a point in the other half closer than our current best (closest) point, then we can skip that half of the space entirely. This simplification is the one shown on the 6th frame.
Figuring out whether this simplification is possible is done by comparing the distance from the search point to the hyperplane against the distance from the search point to the current best. Because the hyperplane is aligned to the axes, the shortest line from it to any other point will be a line along one dimension, so the distance to the hyperplane is just the difference in the coordinate of the dimension that the hyperplane splits.
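To spell out why one coordinate is enough, here is a short derivation. The symbols are my own labels, not taken from the Wikipedia text: $q$ is the search point, $p$ is any point stored on the other side of the split, $d$ is the splitting dimension, $s$ is the splitting coordinate of the current node, and $k$ is the number of dimensions.

$$
\lVert q - p \rVert^2 \;=\; \sum_{i=1}^{k} (q_i - p_i)^2 \;\ge\; (q_d - p_d)^2 \;\ge\; (q_d - s)^2
$$

The first inequality just drops the other (non-negative) terms of the sum. The second holds because $p$ is on the opposite side of the plane from $q$, so $s$ lies between $q_d$ and $p_d$. In other words, no point on the other side can be closer to $q$ than the plane itself is.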
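If it’s farther from the search point to the hyperplane than from the search point to your current closest point, then every point on the other side is at least that far away too, so there’s no reason to search past that splitting coordinate. Note that the second quantity is the full distance to the current best, over all coordinates, not just the difference in its splitting coordinate.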
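If it helps to see the check in code, here is a minimal Python sketch of the search as I understand it from the quoted steps. The names `Node`, `build`, `sq_dist` and `nearest` are made up for illustration, and the tree is built by a simple median split, which is an assumption on my part rather than something the article requires. It uses squared distances throughout, as the quoted text suggests.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


@dataclass
class Node:
    # Hypothetical node layout: one point per node plus the axis it splits on.
    point: Point
    axis: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a k-d tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(pts[mid], axis,
                build(pts[:mid], depth + 1),
                build(pts[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance over all coordinates."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(node: Optional[Node], query: Point, best=None):
    """Return (closest_point, squared_distance) for the subtree at `node`."""
    if node is None:
        return best
    # Signed distance from the query to this node's splitting plane.
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)

    # Go down the side the query falls on first.
    best = nearest(near, query, best)

    # On the way back up: is the current node itself closer?
    d = sq_dist(query, node.point)
    if best is None or d < best[1]:
        best = (node.point, d)

    # The pruning test: only visit the other side if the plane is closer
    # than the current best (both sides of the comparison are squared).
    if diff * diff < best[1]:
        best = nearest(far, query, best)
    return best


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
tree = build(points)
print(nearest(tree, (9, 2)))  # ((8, 1), 2)
```

In that example the whole left half of the tree is skipped at the root: the query is 2 away from the plane x = 7 (squared: 4), while the current best is only squared distance 2 away, so nothing over there can win.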
Even if my explanation doesn’t help, the graphic will. Good luck on your project!