A point in 3-D is defined by (x,y,z). The distance d between any two points (X,Y,Z) and (x,y,z) is d = Sqrt[(X-x)^2 + (Y-y)^2 + (Z-z)^2].
Now there are a million entries in a file. Each entry is some point in space, and the entries are in no specific order. Given any point (a,b,c), find the 10 points nearest to it. How would you store the million points, and how would you retrieve those 10 points from that data structure?
A million points is a small number. The most straightforward approach works here; code based on a KDTree is slower when querying only one point.
Brute-force approach (time ~1 second)
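As a minimal sketch of the brute-force idea (not the code that produced the timing above), the version below computes the squared distance from the query to every point, sorts, and keeps the first 10. The names `squared_distance` and `nearest_brute_force` are made up for illustration, and points are assumed to be plain `(x, y, z)` tuples already loaded into memory.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two 3-D points.

    Skipping the square root does not change which points are nearest.
    """
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2


def nearest_brute_force(points, query, k=10):
    """Return the k points closest to `query`, closest first.

    Sorts every point by distance, so the cost is O(n log n) per query.
    """
    return sorted(points, key=lambda p: squared_distance(p, query))[:k]
```

Calling `nearest_brute_force(points, (a, b, c))` returns a list of at most 10 tuples. No index is built, so nothing has to be stored beyond the list of points itself.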
Run it:
Here’s the script that generates a million 3D points:
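A sketch of what such a generator might look like, under assumptions of my own: the file format (plain text, one point per line, three space-separated floats in [0, 1)) and the helper names `write_points` and `read_points` are hypothetical, not taken from the original script.

```python
import random


def write_points(path, n=1_000_000, seed=0):
    """Write n pseudo-random 3-D points to a text file, one point per line."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same file
    with open(path, "w") as f:
        for _ in range(n):
            f.write(f"{rng.random()} {rng.random()} {rng.random()}\n")


def read_points(path):
    """Read the file back as a list of (x, y, z) float tuples."""
    with open(path) as f:
        return [tuple(map(float, line.split())) for line in f]
```

A fixed seed keeps timings comparable between runs, since every approach then searches exactly the same data.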
Output:
You could use that code to test more complex data structures and algorithms (for example, to check whether they actually consume less memory or run faster than the simplest approach above). It is worth noting that at the moment this is the only answer that contains working code.
Solution based on KDTree (time ~1.4 seconds)
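To show what a KDTree-based lookup involves, here is a small hand-rolled k-d tree in pure Python. It is an illustrative sketch, not a library KDTree class and not the code that was timed above; `build_kdtree` and `nearest_kdtree` are hypothetical names. The tree splits on x, y, z in turn, and the search skips a subtree whenever the splitting plane is farther away than the current 10th-best candidate.

```python
import heapq


def build_kdtree(points, depth=0):
    """Build a k-d tree as nested (point, left, right) tuples."""
    if not points:
        return None
    axis = depth % 3  # cycle through x, y, z
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2  # median point becomes this node
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))


def nearest_kdtree(tree, query, k=10):
    """Return the k points nearest to `query`, closest first."""
    heap = []  # max-heap of (-squared_distance, point), at most k entries

    def visit(node, depth):
        if node is None:
            return
        point, left, right = node
        d2 = sum((a - b) ** 2 for a, b in zip(point, query))
        if len(heap) < k:
            heapq.heappush(heap, (-d2, point))
        elif d2 < -heap[0][0]:
            heapq.heapreplace(heap, (-d2, point))  # evict current worst
        axis = depth % 3
        diff = query[axis] - point[axis]
        near, far = (left, right) if diff < 0 else (right, left)
        visit(near, depth + 1)
        # The far side can only help if the splitting plane is closer
        # than the current k-th best candidate.
        if len(heap) < k or diff * diff < -heap[0][0]:
            visit(far, depth + 1)

    visit(tree, 0)
    return [p for _, p in sorted(heap, reverse=True)]
```

Building the tree costs more than one linear scan over the data, which is consistent with the tree being slower for a single query. It pays off only when many queries are run against the same set of points.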
Run it:
Partial sort in C++ (time ~1.1 seconds)
Run it:
Priority Queue in C++ (time ~1.2 seconds)
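The priority-queue idea, sketched here in Python rather than C++ to match the other sketches: keep a max-heap of the 10 best candidates seen so far and replace its worst entry whenever a closer point turns up. The function name `nearest_with_heap` is hypothetical. Python's `heapq` is a min-heap, so distances are stored negated.

```python
import heapq


def nearest_with_heap(points, query, k=10):
    """Return the k points nearest to `query`, closest first.

    `points` may be any iterable, so points can be streamed from a file
    without holding all of them in memory. Cost is O(n log k).
    """
    heap = []  # (-squared_distance, point); heap[0] is the worst kept point
    for p in points:
        d2 = ((p[0] - query[0]) ** 2
              + (p[1] - query[1]) ** 2
              + (p[2] - query[2]) ** 2)
        if len(heap) < k:
            heapq.heappush(heap, (-d2, p))
        elif d2 < -heap[0][0]:
            heapq.heapreplace(heap, (-d2, p))  # drop the current worst
    return [p for _, p in sorted(heap, reverse=True)]
```

Only 10 candidates are held at any time, so memory use does not depend on the number of points.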
Run it:
Linear search-based approach (time ~1.15 seconds)
Measurements show that most of the time is spent reading the array from the file; the actual computations take an order of magnitude less time.