I have a function
myLength = foldl (\ x _ -> x + 1) 0
which fails with stack overflow with input around 10^6 elements (myLength [1..1000000] fails). I believe that is due to the thunk build up since when I replace foldl with foldl’, it works.
So far so good.
But now I have another function to reverse a list :
myReverse = foldl (\ acc x -> x : acc) []
which uses the lazy version foldl (instead of foldl’)
When I do
myLength . myReverse $ [1..1000000].
This time it works fine. I fail to understand why foldl works for the later case and not for former?
To clarify here myLength uses foldl’ while myReverse uses foldl
Here’s my best guess, though I’m no expert on Haskell internals (yet).
While building the thunk, Haskell allocates all the intermediate accumulator variables on the heap.
When performing the addition as in
myLength, it needs to use the stack for intermediate variables. See this page. Excerpt:However, when performing list construction, here’s what I think happens (this is where the guesswork begins):
Note that z999999, z999998 etc. changing from a not-yet-evaluated expression into a single list item is an everyday Haskell thing 🙂
Since z1000000, z999999, z999998, etc. are all on the heap, these operations don’t use any stack space. QED.