I don’t get how something like a Set can be immutable and still have acceptable performance.
From what I’ve read, F# Sets internally use Red Black Trees as their implementation. If we basically have to recreate the tree each time we want to add something new to it, how can it ever have good performance? What am I missing here?
Although I am asking this about F#’s Sets, I think it is equally relevant to any other language which has or uses immutable data structures.
Thanks
Almost all immutable collections are some form of balanced tree. To create a new tree, you only have to reallocate the nodes on the path from the change (insert, remove, “update”) to the root; every other subtree is shared, unchanged, with the old tree. As long as the tree is balanced, this takes logarithmic time. If you have something like a 2-3-4 tree (similar to red-black trees) with an expected out-degree of three, you can handle a million elements using only about 13 allocations per change, since log base 3 of 1,000,000 is roughly 12.6.
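The path-copying idea can be sketched in a few lines. This is only an illustration, not how F# implements Set: it uses a plain binary search tree built already balanced (no rebalancing on insert), and the names `Node`, `build`, `insert`, and `contains` are made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True, eq=False)
class Node:
    """An immutable tree node; eq=False keeps comparisons identity-based."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(keys: Sequence[int], lo: int = 0, hi: Optional[int] = None) -> Optional[Node]:
    """Build a balanced tree from keys that are already sorted."""
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(keys[mid], build(keys, lo, mid), build(keys, mid + 1, hi))


def insert(node: Optional[Node], key: int, fresh: List[Node]) -> Node:
    """Return the root of a tree that also contains key.

    Only nodes on the search path are re-created; each new node is
    appended to `fresh` so the allocations can be counted.
    """
    if node is None:
        new = Node(key)
    elif key < node.key:
        new = Node(node.key, insert(node.left, key, fresh), node.right)
    elif key > node.key:
        new = Node(node.key, node.left, insert(node.right, key, fresh))
    else:
        return node  # key already present; this subtree is reused as-is
    fresh.append(new)
    return new


def contains(node: Optional[Node], key: int) -> bool:
    """Ordinary binary-search lookup."""
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


old = build(range(0, 2046, 2))   # 1023 even keys: a perfect tree of height 10
fresh: List[Node] = []
new = insert(old, 1, fresh)
print(len(fresh))                # 11: ten copied ancestors plus the new leaf
print(new.right is old.right)    # True: the untouched half is shared, not copied
print(contains(old, 1))          # False: the old set is still intact
```

With a million keys the same binary tree is about 20 levels deep, so an insert allocates around 20 nodes; a wider node, as in a 2-3-4 tree, makes the path shorter still.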
Languages where data structures are expected to be pure also make sure that allocation is fast. Allocating a four-element node is going to cost a compare (is there room left?), an increment (of the allocation pointer), and four stores. And in many cases you can amortize the cost of the compare over several allocations.
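As a rough model of that cost, here is a toy bump allocator. It is a sketch under obvious assumptions: the "heap" is just a Python list, the `Nursery` class and its `alloc4` method are hypothetical names, and a real runtime would do this in a few machine instructions and start a garbage collection instead of raising an error.

```python
class Nursery:
    """Toy model of a bump-allocated heap region."""

    def __init__(self, size: int) -> None:
        self.heap = [None] * size
        self.free = 0  # index of the next unused slot

    def alloc4(self, a, b, c, d) -> int:
        """Allocate a four-element node and return its address (an index)."""
        if self.free + 4 > len(self.heap):   # the compare
            raise MemoryError("nursery full; a real runtime would collect here")
        addr = self.free
        self.free += 4                       # the increment
        self.heap[addr] = a                  # the four stores
        self.heap[addr + 1] = b
        self.heap[addr + 2] = c
        self.heap[addr + 3] = d
        return addr


nursery = Nursery(8)
first = nursery.alloc4("k1", "k2", "left", "right")
second = nursery.alloc4("k3", "k4", "left", "right")
print(first, second)   # 0 4
```

The amortization mentioned above would correspond to checking once that there is room for several nodes and then bumping the pointer for each of them without a further compare.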
If you want to know more about how these structures work, an excellent source is Purely Functional Data Structures by Chris Okasaki.