I would like to better understand the internals of e.g. Data.Map. When I insert a new binding into a Map, then, because data is immutable, I get back a new data structure that is identical to the old one plus the new binding.
I would like to understand how this is achieved. Does the compiler eventually implement this by copying the whole data structure with e.g. millions of bindings? Can it generally be said that mutable data structures/arrays (e.g. Data.Judy) or imperative programming languages perform better in such cases? Does immutable data have any advantage when it comes to dictionaries/key-value stores?
Map is built on a tree data structure. Basically, a new Map value is constructed, but it’ll be filled almost entirely with pointers to the old structure. Since values never change in Haskell, this is a safe and very important optimisation, known as sharing.

This means that you can have many similar versions of the same data structure hanging around, but only the branches of the tree that differ will be stored anew; the rest will simply be pointers to the original copy of the branch. And, of course, if you throw away the old Map, the old versions of the branches you changed will be reclaimed by the garbage collector.

Sharing is key to the performance of immutable data structures. You might find this Wikipedia article helpful; it has some enlightening graphs showing how modified data gets represented with sharing.
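
To make the sharing concrete, here is a minimal sketch of insertion into a persistent binary search tree. It is a simplification and not the actual Data.Map source: Data.Map uses a size-balanced tree and rebalances on insert, which this unbalanced version skips. The names `Tree`, `insert`, `toList` and `fromList` below are defined locally for illustration. The point it shows is that only the nodes on the path from the root to the inserted key are rebuilt, which is why a balanced tree can offer O(log n) insertion (the cost documented for Data.Map.insert) rather than copying millions of bindings.

```haskell
module Main where

-- A simplified, unbalanced binary search tree (illustrative only).
data Tree k v
  = Leaf
  | Node (Tree k v) k v (Tree k v)

-- Insert rebuilds only the nodes on the path from the root to the key.
-- Every subtree off that path is reused unchanged, i.e. shared between
-- the old tree and the new one.
insert :: Ord k => k -> v -> Tree k v -> Tree k v
insert k v Leaf = Node Leaf k v Leaf
insert k v (Node l k' v' r) =
  case compare k k' of
    LT -> Node (insert k v l) k' v' r  -- right subtree r is shared
    GT -> Node l k' v' (insert k v r)  -- left subtree l is shared
    EQ -> Node l k v r                 -- both subtrees are shared

-- In-order traversal, so keys come out sorted.
toList :: Tree k v -> [(k, v)]
toList Leaf = []
toList (Node l k v r) = toList l ++ [(k, v)] ++ toList r

-- Build a tree by inserting the pairs from left to right.
fromList :: Ord k => [(k, v)] -> Tree k v
fromList = foldl (\t (k, v) -> insert k v t) Leaf

main :: IO ()
main = do
  let old = fromList [(4, "d"), (2, "b"), (6, "f"), (1, "a")] :: Tree Int String
      new = insert 7 "g" old
  -- The old version is still intact after the insert.
  print (toList old)
  print (toList new)
```

Here `insert 7 "g" old` allocates three nodes (new copies of the nodes for keys 4 and 6, plus the new node for 7); the whole left subtree holding keys 1 and 2 is the same heap object in both `old` and `new`.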