I have a database of animals, each with many attributes ranging from 0 to 1. These attributes are things like size, speed, hairiness, etc. Given an input set of attributes, and a weight for each type of attribute, I need to find the “closest” match in the set of animals. Is there an algorithm that accomplishes this in better than O(n) time?
What I’m specifically trying to do is find suitable textures for “animals” produced by a genetic algorithm in a game, by matching them to animals that already exist. By “closest,” I mean the animal whose weighted sum of attribute differences is minimal. The database and weights are known at application launch time, so a lot of time can be invested in preparing the data.
I’ve found algorithms on string matching and product matching given user preferences, but either I’m not finding what I’m looking for or I’m not understanding how to reapply such concepts to my dilemma. Perhaps there’s something from the world of graph theory to help me out?
Any help would be greatly appreciated!
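You could treat the items as points in a high-dimensional space, and insert them all into a BSP tree, such as a k-d tree. To apply the attribute weights, multiply each coordinate by its weight:
(w1*x, w2*y, ...)
Scale the query point the same way before searching. The Manhattan distance between two scaled points is then exactly the weighted sum of attribute differences you described.
Preparation: (from wikipedia, python code)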
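A minimal sketch of the preparation step, using the usual median-split k-d tree construction. The animal names, attribute values, weights, and the helper names `scale` and `build_kdtree` are made up for illustration; they are not from the question.

```python
from collections import namedtuple

# One tree node: the scaled point, its label, the split axis, and two subtrees.
Node = namedtuple("Node", ["point", "label", "axis", "left", "right"])


def scale(attrs, weights):
    """Multiply each attribute by its weight: (w1*x, w2*y, ...)."""
    return tuple(w * a for w, a in zip(weights, attrs))


def build_kdtree(items, depth=0):
    """Build a k-d tree from a list of (scaled_point, label) pairs.

    The split axis cycles through the dimensions, and the median item
    along that axis becomes the node.
    """
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda item: item[0][axis])
    mid = len(items) // 2
    point, label = items[mid]
    return Node(
        point,
        label,
        axis,
        build_kdtree(items[:mid], depth + 1),
        build_kdtree(items[mid + 1:], depth + 1),
    )


# Hypothetical database: (size, speed, hairiness), each in [0, 1].
animals = {
    "mouse": (0.1, 0.3, 0.8),
    "cheetah": (0.5, 1.0, 0.6),
    "elephant": (1.0, 0.4, 0.1),
    "tortoise": (0.3, 0.05, 0.0),
}
weights = (2.0, 1.0, 0.5)  # assumed weights, known at launch time

tree = build_kdtree([(scale(a, weights), name) for name, a in animals.items()])
```

Re-sorting at every level keeps the code short at the cost of some build time, which should be acceptable here since the tree is built once at launch.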
Search: (from gist, based on the wikipedia algorithm)
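A sketch of the search step, assuming the Manhattan metric so that the result matches the weighted sum of differences. The build is repeated in compact form so the snippet runs on its own; all names and data are illustrative.

```python
from collections import namedtuple

Node = namedtuple("Node", ["point", "label", "axis", "left", "right"])


def build_kdtree(items, depth=0):
    """Median-split k-d tree over (scaled_point, label) pairs."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda item: item[0][axis])
    mid = len(items) // 2
    return Node(items[mid][0], items[mid][1], axis,
                build_kdtree(items[:mid], depth + 1),
                build_kdtree(items[mid + 1:], depth + 1))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest(node, query, best=None):
    """Return (distance, label) of the stored point closest to query."""
    if node is None:
        return best
    dist = manhattan(node.point, query)
    if best is None or dist < best[0]:
        best = (dist, node.label)
    # Descend first into the side of the splitting plane containing the query.
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # The far side can only hold a closer point if the plane itself is
    # closer than the best match found so far.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


# Hypothetical data: (size, speed, hairiness) and assumed weights.
weights = (2.0, 1.0, 0.5)
animals = {
    "mouse": (0.1, 0.3, 0.8),
    "cheetah": (0.5, 1.0, 0.6),
    "elephant": (1.0, 0.4, 0.1),
    "tortoise": (0.3, 0.05, 0.0),
}


def scale(attrs):
    return tuple(w * a for w, a in zip(weights, attrs))


tree = build_kdtree([(scale(a), name) for name, a in animals.items()])
distance, match = nearest(tree, scale((0.4, 0.9, 0.5)))
print(match)  # cheetah
```

For points spread fairly evenly in a handful of dimensions, a search like this typically visits only a small part of the tree. With many attributes the pruning test fails more often and the cost drifts back toward a linear scan, so it is worth measuring against brute force on the real data.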
Read more: