Here’s my implementation of a sort of treap (with implicit keys and some additional information stored in nodes): http://hpaste.org/42839/treap_with_implicit_keys
According to profiling data, GC takes 80% of this program’s running time. As far as I understand, that is because every time a node is ‘modified’, each node on the path to the root is recreated.
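To make the suspected cause concrete, here is a minimal sketch of path copying in a persistent tree with implicit keys. It is not the code from the paste: `Tree`, `node`, `fromList` and `setAt` are simplified, hypothetical names, and the balancing priorities of a real treap are left out.

```haskell
-- A hypothetical, simplified tree with implicit keys: the position of an
-- element is derived from subtree sizes rather than stored explicitly.
data Tree a
  = Leaf
  | Node !Int (Tree a) a (Tree a)   -- cached size, left, value, right

size :: Tree a -> Int
size Leaf           = 0
size (Node s _ _ _) = s

-- Smart constructor that recomputes the cached size.
node :: Tree a -> a -> Tree a -> Tree a
node l x r = Node (size l + size r + 1) l x r

-- Build a balanced tree from a list (only needed for the demonstration).
fromList :: [a] -> Tree a
fromList [] = Leaf
fromList xs = node (fromList ls) x (fromList rs)
  where (ls, x : rs) = splitAt (length xs `div` 2) xs

toList :: Tree a -> [a]
toList Leaf           = []
toList (Node _ l x r) = toList l ++ [x] ++ toList r

-- Replace the element at index i. Every Node on the path from the root to
-- the target is rebuilt (one allocation each); untouched subtrees are shared.
setAt :: Int -> a -> Tree a -> Tree a
setAt _ _ Leaf = Leaf
setAt i y (Node s l x r)
  | i < sl    = Node s (setAt i y l) x r
  | i == sl   = Node s l y r
  | otherwise = Node s l x (setAt (i - sl - 1) y r)
  where sl = size l
```

Each call to `setAt` allocates one new node per level it descends through, so a loop of many updates produces a steady stream of short-lived garbage even though most of the tree is shared.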
Is there something I can do here to improve performance, or do I have to descend into the realm of the ST monad?
Using GHC 7.0.3, I can reproduce your heavy GC behavior:
I spent 10 minutes going through the program. Here’s what I did, in order:
The result is a 10-fold speedup, with GC at around 45% of the running time.
In order, using GHC’s magic -H flag, we can reduce that runtime quite a bit:
Not bad!
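For reference, a run with a heap size hint might look like the sketch below. The file name `Treap.hs` and the value `64m` are assumptions made for illustration, not the settings used for the measurements here.

```shell
# Treap.hs is a placeholder file name; 64m is an arbitrary example value.
# -rtsopts allows RTS options to be passed on the command line.
ghc -O2 -rtsopts Treap.hs

# -H gives the RTS a suggested heap size; -s prints GC statistics at exit.
./Treap +RTS -H64m -s -RTS
```

The `-s` summary reports how much of the running time was spent in GC, which is the number to compare between runs.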
The UNPACK pragmas on the Tree nodes won’t do anything, so remove those.
Inlining update shaves off more runtime:
as does inlining height.
So while it is fast, GC is still dominating, since we’re testing allocation, after all.
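The UNPACK and inlining changes can be sketched as follows. The node layout and the body of `update` are hypothetical stand-ins, since the program's own definitions are not reproduced here. The sketch also assumes that `Tree` has more than one constructor, which would explain why UNPACK has no effect on fields of that type: GHC can only unpack strict fields whose type has a single constructor.

```haskell
-- Hypothetical node layout; the real program's fields are not shown here.
data Tree
  = Nil
  | Node {-# UNPACK #-} !Int   -- cached height: Int has one constructor, so UNPACK applies
         !Tree                 -- left child: Tree has two constructors, so UNPACK would be ignored
         {-# UNPACK #-} !Int   -- payload
         !Tree                 -- right child

-- Read the cached height of a tree.
height :: Tree -> Int
height Nil            = 0
height (Node h _ _ _) = h
{-# INLINE height #-}

-- Stand-in for the program's update: rebuild a node with a fresh cached height.
update :: Tree -> Int -> Tree -> Tree
update l x r = Node (1 + max (height l) (height r)) l x r
{-# INLINE update #-}
```

Both pragmas only take effect when compiling with optimization (for example `-O2`).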
One thing we can do is increase the first gen size:
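As a sketch, the first generation (the allocation area) is sized with the RTS flag -A. The binary name `Treap` and the value `64m` below are assumptions for illustration; the value used for the measurement here is not shown.

```shell
# Treap is a placeholder binary name, built with -rtsopts; 64m is an arbitrary example value.
# -A sets the size of the allocation area (the youngest generation).
./Treap +RTS -A64m -s -RTS
```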
And increasing the unfolding threshold, as JohnL suggested, helps a little,
which is what, 10x faster than we started? Not bad.
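The unfolding threshold is a compile-time setting rather than an RTS option. A sketch of how it might be passed, where the file name `Treap.hs` and the value `16` are arbitrary assumptions rather than the settings used above:

```shell
# Treap.hs and the threshold of 16 are placeholders for illustration.
# A higher threshold makes GHC more willing to inline larger functions at call sites.
ghc -O2 -rtsopts -funfolding-use-threshold=16 Treap.hs
```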
Using ghc-gc-tune, you can see runtime as a function of -A and -H.
Interestingly, the best running times use very large -A values, e.g.