I know I can remove the last element from a set:
s.Remove(s.MaximumElement)
But if I want to remove the n maximum elements… do I just execute the above n times, or is there a faster way to do that?
To be clear, this is an obvious solution:
let rec removeLastN (s : Set<'a>, num : int) : Set<'a> =
match num with
| 0 -> s
| _ -> removeLast(s.Remove(s.MinimumElement), num-1)
But it involves creating a new set n times. Is there a way to do it and only create a new set once?
To the best of my knowledge, no. I’d say what you have a perfectly fine implementation, it runs in O(lg n) — and its concise too 🙂 Most heap implementations give you O(lg n) for delete min anyway, so what you have is about as good as you can get it.
You might be able to get a little better speed by rolling your balanced tree, and implementing a function to drop a left or right branch for all values greater than a certain value. I don’t think an AVL tree or RB tree are appropriate in this context, since you can’t really maintain their invariants, but a randomlized tree will give you the results you want.
A treap works awesome for this, because it uses randomization rather than tree invariants to keep itself relatively balanced. Unlike an AVL tree or a RB-tree, you can split a treap on a node without worrying about it being unbalanced. Here’s a treap implementation I wrote a few months ago:
http://pastebin.com/j0aV3DJQ
I’ve added a
splitfunction, which will allows you take a tree and return two trees containing all values less than and all values greater than a given value.splitruns in O(lg n) using a single pass through the tree, so you can prune entire branches of your tree in one shot — provided that you know which value to split on.Using my
Treapclass:removeTopNruns in O(n + lg m) time, where n is the index into the tree sequence and m is the number of items in the tree.I make no guarantees about the accuracy of my code, use at your own peril 😉