Given that I have the following in my knowledge-database:
1 0 6 20 0 0 6 20
1 0 3 6 0 0 3 6
1 0 15 45 0 0 15 45
1 0 17 44 0 0 17 44
1 0 2 5 0 0 2 5
I want to be able to find the nearest neighbors of the following vector:
1 0 5 16 0 0 5 16
according to a distance metric. So in this case, given a particular threshold, I should find that the first vector listed is a near-neighbor to the given vector. Currently, the size of my knowledge database is in the order of millions so calculating the distance metric for each and every point and then comparing is proving expensive. Are there any alternatives on how to achieve this with a significant speedup?
I am open to pretty much any approach including using spatial indexes in MySQL (except that I am not entirely sure on how this problem can be solved) or some kind of hashing (this would be great but again, I am not entirely sure).
In Python (from http://www.comp.mq.edu.au/):
also another implementation of http://www.umanitoba.ca/