I am working on a piece of 3D software that sometimes has to perform intersections between massive numbers of curves (sometimes ~100,000). The most natural way to do this is an N^2 bounding box check, after which only those curves whose bounding boxes overlap get intersected.
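For reference, the brute-force check could look roughly like the sketch below. The Box struct, its field names, and the BruteForce class are hypothetical and only illustrate the idea; they are not taken from the actual software.

```csharp
using System.Collections.Generic;

// Hypothetical axis-aligned bounding box; the field names are assumptions.
public struct Box
{
    public double MinX, MinY, MinZ, MaxX, MaxY, MaxZ;

    public Box(double minX, double minY, double minZ,
               double maxX, double maxY, double maxZ)
    {
        MinX = minX; MinY = minY; MinZ = minZ;
        MaxX = maxX; MaxY = maxY; MaxZ = maxZ;
    }

    // Two boxes overlap only if their extents overlap on every axis.
    public bool Overlaps(Box other)
    {
        return MinX <= other.MaxX && other.MinX <= MaxX
            && MinY <= other.MaxY && other.MinY <= MaxY
            && MinZ <= other.MaxZ && other.MinZ <= MaxZ;
    }
}

public static class BruteForce
{
    // Returns every pair (i, j) with i < j whose boxes overlap.
    // This performs N * (N - 1) / 2 box tests.
    public static List<KeyValuePair<int, int>> OverlappingPairs(IList<Box> boxes)
    {
        var pairs = new List<KeyValuePair<int, int>>();
        for (int i = 0; i < boxes.Count; i++)
            for (int j = i + 1; j < boxes.Count; j++)
                if (boxes[i].Overlaps(boxes[j]))
                    pairs.Add(new KeyValuePair<int, int>(i, j));
        return pairs;
    }
}
```

With 100,000 curves this is about 5 billion box tests, which is the cost the octree is meant to avoid.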
I heard good things about octrees, so I decided to try implementing one to see if I would get improved performance.
Here’s my design:
Each octree node is implemented as a class with a list of subnodes and an ordered list of object indices.
When an object is being added, it’s added to the lowest node that entirely contains the object, or some of that node’s children if the object doesn’t fill all of the children.
Now, what I want to do is retrieve all objects that share a tree node with a given object. To do this, I traverse all tree nodes, and if they contain the given index, I add all of their other indices to an ordered list.
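A minimal sketch of that traversal, assuming a node layout like the one described above. OctreeNode, OctreeQuery, and Neighbours are hypothetical names used for illustration, not the real implementation.

```csharp
using System.Collections.Generic;

// Hypothetical node layout based on the description above.
public class OctreeNode
{
    public List<OctreeNode> Children = new List<OctreeNode>();
    public List<int> Indices = new List<int>();   // kept in ascending order
}

public static class OctreeQuery
{
    // Collects, in ascending order and without duplicates, every index
    // that shares at least one node with 'index'.
    public static List<int> Neighbours(OctreeNode root, int index)
    {
        var result = new List<int>();
        var stack = new Stack<OctreeNode>();
        stack.Push(root);

        while (stack.Count > 0)
        {
            OctreeNode node = stack.Pop();
            foreach (OctreeNode child in node.Children)
                stack.Push(child);

            // Indices is sorted, so membership is a binary search.
            if (node.Indices.BinarySearch(index) < 0)
                continue;

            foreach (int other in node.Indices)
            {
                if (other == index) continue;

                // A negative result is the bitwise complement of the
                // position at which 'other' should be inserted.
                int pos = result.BinarySearch(other);
                if (pos < 0)
                    result.Insert(~pos, other);   // shifts the tail of the list
            }
        }
        return result;
    }
}
```

For a root holding indices 1, 3, 5 with children holding 2, 3 and 4, 6, querying index 3 visits the root and the first child and yields 1, 2, 5.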
This is efficient because the indices within each node are already ordered, so finding out if each index is already in the list is fast. However, the list ends up having to be resized, and this takes up most of the time in the algorithm. So what I need is some kind of tree-like data structure that will allow me to efficiently add ordered data, and also be efficient in memory.
Any suggestions?
SortedDictionary (.NET 2+) or SortedSet (.NET 4 only) is probably what you want. They are tree structures. SortedList is a dumb class which is no different from List structurally.
However, it is still not entirely clear to me why you need this list to be sorted.
Maybe if you could elaborate on this matter we could find a solution where you don’t need sorting at all. For example, a simple HashSet could do. It is faster at both lookups and insertions than SortedList or any of the tree structures if hashing is done properly.
OK, now that it is clear to me that you want to merge sorted lists, I can try to write an implementation.
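As a point of comparison for the merging attempts, the HashSet idea mentioned above could be sketched as follows. HashSetUnion and Union are hypothetical names, and the final sort is only needed if the caller really requires ordered output.

```csharp
using System.Collections.Generic;

public static class HashSetUnion
{
    // Unions several index lists and drops duplicates. Ordering is
    // restored by a single sort at the end, not maintained during insertion.
    public static List<int> Union(IEnumerable<IEnumerable<int>> lists)
    {
        var seen = new HashSet<int>();
        foreach (IEnumerable<int> list in lists)
            seen.UnionWith(list);   // adds only values not already present

        var result = new List<int>(seen);
        result.Sort();              // drop this line if order does not matter
        return result;
    }
}
```

Note that HashSet requires .NET 3.5 or later.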
At first, I implemented merging using SortedDictionary to store the heads of all the arrays. At each iteration I removed the smallest element from the dictionary and added the next one from the same array. Performance tests showed that the overhead of SortedDictionary is huge, so it is almost impossible to make this faster than simple concatenation + sorting. It even struggles to match SortedList performance on small tests.
Then I replaced SortedDictionary with a custom-made binary heap implementation. The performance improvement was tremendous (more than 6 times). This heap implementation even manages to beat .Distinct() (which is usually the fastest) in some tests.
Here is my code: