I have encountered a question in my Data Structures homework that neither I nor any of my colleagues could figure out; we do not even know where to start!
The question asks us to suggest an enhancement to the B-Tree: a function order(k), where k is a key in the B-Tree, that would display in O(log n) the key’s position in the sorted order of all the keys in the B-Tree.
We also need to show that the “enhancement” does not affect the complexity of the regular abstract functions of the B-Tree.
We can use O(n) extra space, where n is the number of keys in the B-Tree.
Further explanation: Take for example a B-Tree that has the keys A B C D E F G H I J K L M N.
- order(A) result should be “1”.
- order(N) result should be “14”.
- order(I) result should be “9”.
What I have figured out so far:
- Given that we are allowed O(n) extra space, and that a regular B-Tree already takes O(n) space, we should probably use an extra B-Tree as a helper.
- Since we are asked to show that the enhancement does not affect the complexity of the regular B-Tree functions, at some point we have to modify those functions, in a way that leaves their complexity unchanged.
- The requirement that order(k) run in O(log n) suggests that we should go through the B-Tree in a height-based way (along one root-to-leaf path), not node by node.
- Somewhere we probably have to check whether the given k in order(k) actually exists in the B-Tree; I suggest using the regular search function of the B-Tree for that.
At each node, you should store extra data that records how many keys are in the subtree rooted at that node (including the keys in the node itself).
To maintain this, the insert(k) function would have to travel back up through all of the ancestors of the node that receives the new key k and increment their counts. This would make insert O(log n) + O(log n), which is still O(log n), and thus does not affect the complexity. The delete(k) function would have to do the same thing, except decrement the counts. Balancing operations (splitting, merging, and moving keys between siblings) would also have to recompute the counts of the nodes they touch.
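The count maintenance described above can be sketched as follows. This is a minimal illustration, not a full B-tree: the `Node` class, its `size` and `parent` fields, and the helpers `adjust_counts` and `insert_into_leaf` are all hypothetical names, and node splitting is deliberately left out.

```python
import bisect

class Node:
    """B-tree node with an assumed extra field `size`: keys in this subtree."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])
        self.parent = None
        for child in self.children:
            child.parent = self
        self.size = len(self.keys) + sum(c.size for c in self.children)

def adjust_counts(node, delta):
    """Walk from `node` up to the root, adding `delta` to each size on the path."""
    while node is not None:
        node.size += delta
        node = node.parent

def insert_into_leaf(leaf, key):
    """Insert `key` into a leaf that still has room (no split handled here)."""
    bisect.insort(leaf.keys, key)
    adjust_counts(leaf, +1)   # one extra pass over at most `height` nodes

# Tiny example: a root with two leaves.
left, right = Node(["A", "B"]), Node(["D", "E"])
root = Node(["C"], [left, right])
insert_into_leaf(right, "F")
print(root.size, left.size, right.size)   # 6 2 3
```

The upward walk touches one node per level, so it adds O(log n) work to an insert or delete. A real implementation would also recompute `size` for the nodes involved in each split or merge.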
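Then, order(k) would travel down the tree to k: at each node it visits, it adds to a running total the number of keys that lie to the left of its path (the smaller keys in that node, plus the counts of the subtrees to their left). When it finds k, it returns this total plus one, so that the smallest key has order 1.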
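Here is a small sketch of that descent, assuming each node carries a per-subtree key count. The `Node` class, the `size` field, and the hand-built example tree are illustrative assumptions; only the name order(k) and the keys A to N come from the question.

```python
from bisect import bisect_left

class Node:
    """B-tree node with an assumed extra field `size`: keys in this subtree."""
    def __init__(self, keys, children=None):
        self.keys = list(keys)
        self.children = list(children or [])
        self.size = len(self.keys) + sum(c.size for c in self.children)

def order(root, k):
    """Return the 1-based position of k in sorted order, or None if k is absent."""
    node, rank = root, 0
    while node is not None:
        i = bisect_left(node.keys, k)        # keys in this node smaller than k
        rank += i
        # Subtrees hanging to the left of position i hold only smaller keys.
        rank += sum(c.size for c in node.children[:i])
        if i < len(node.keys) and node.keys[i] == k:
            if node.children:
                rank += node.children[i].size   # subtree just left of k
            return rank + 1
        node = node.children[i] if node.children else None
    return None

# Hand-built tree holding A..N (14 keys), all leaves at the same depth.
left = Node(["C", "F"], [Node(["A", "B"]), Node(["D", "E"]), Node(["G", "H"])])
right = Node(["L"], [Node(["J", "K"]), Node(["M", "N"])])
root = Node(["I"], [left, right])

print(order(root, "A"), order(root, "I"), order(root, "N"))   # 1 9 14
```

Each level does work proportional to the number of children in one node, so for a fixed B-tree order the whole query is O(log n). Returning None for a missing key also covers the existence check mentioned in the question.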
EDIT: I resolved the ambiguity of “node”, which I had used for both node and key; these are different in a B-tree (a node can contain multiple keys). However, the algorithm should generalize to most tree data structures.
This is the algorithm for the B-tree: