Here’s a description:
It operates like a regular map with get, put, and remove methods, but also has a getTopKEntries(int k) method that returns the top-K entries, sorted by key.
For my specific use case, I’m adding, removing, and adjusting a lot of values in the structure, but at any one time there are approximately 500-1000 elements; I want to return the entries for the top 10 keys efficiently.
- I call the `put` and `remove` methods many times.
- I call the `getTopKEntries` method.
- I call the `put` and `remove` methods some more times.
- I call the `getTopKEntries` method.
- …
I’m hoping for O(1) get, put, and remove operations, and for getTopKEntries to be dependent only on K, not on the size of the map.
So what’s a data structure for efficiently returning the top-K elements of a map?
My other question is similar, but is for the case of returning all elements of a map, sorted by the key.
If it helps, both the keys and values are 4-byte integers.
A binary search tree (e.g. `std::map` in C++) sounds like the perfect structure: it keeps its entries ordered by key, so a simple in-order traversal yields the elements in ascending order. Hence, iterating over the first k elements yields the top k elements directly (or iterate in reverse from the end if “top” means the largest keys).
Additionally, since you foresee a lot of “remove” operations, a hash table won’t be well-suited anyway: remove operations destroy the load factor characteristics of hash tables, which leads to a rapid deterioration of the runtime.
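
The tree-based suggestion above could be sketched roughly as follows. This is only a minimal illustration, not a definitive implementation: the class name `TopKMap` and the pointer-returning `get` are assumptions made for the example, and it assumes “top” means the largest keys (hence the `std::greater` comparator; drop it if the smallest keys are wanted). Note that `get`, `put`, and `remove` are O(log n) here rather than the O(1) asked for, which for 500-1000 elements is on the order of ten comparisons per call.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical wrapper; the name TopKMap is an assumption for this sketch.
// std::greater keeps keys in descending order, so the largest keys come first.
class TopKMap {
public:
    using Entry = std::pair<std::int32_t, std::int32_t>;

    // Insert a new entry or overwrite an existing one: O(log n).
    void put(std::int32_t key, std::int32_t value) { entries_[key] = value; }

    // Erase the entry if present; a missing key is silently ignored: O(log n).
    void remove(std::int32_t key) { entries_.erase(key); }

    // Return a pointer to the stored value, or nullptr if the key is absent.
    const std::int32_t* get(std::int32_t key) const {
        auto it = entries_.find(key);
        return it == entries_.end() ? nullptr : &it->second;
    }

    // Copy out at most k entries by walking the tree from begin().
    // The work depends on k (plus the tree height), not on the total size.
    std::vector<Entry> getTopKEntries(std::size_t k) const {
        std::vector<Entry> out;
        out.reserve(std::min(k, entries_.size()));
        for (auto it = entries_.begin();
             it != entries_.end() && out.size() < k; ++it) {
            out.emplace_back(it->first, it->second);
        }
        return out;
    }

private:
    std::map<std::int32_t, std::int32_t, std::greater<std::int32_t>> entries_;
};
```

With this sketch, the pattern from the question (many `put`/`remove` calls, then `getTopKEntries(10)`) needs no extra bookkeeping between calls, because the map is kept in key order at all times.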