I’m wondering why
Prelude> head $ reverse $ [1..10000000] ++ [99]
99
does not lead to a stack overflow error. The ++ in the Prelude seems straightforward and non-tail-recursive:
(++) :: [a] -> [a] -> [a]
(++) [] ys = ys
(++) (x:xs) ys = x : xs ++ ys
EDIT: Initially, I thought the issue had something to do with the way ++ is defined in the Prelude, especially with the rewrite rules, hence the question continued as below. The discussion showed me that this is not the case. I now think that some lazy evaluation effect lets the code run without a stack overflow, but I can’t quite figure out how.
So just with this, it should run into a stack overflow, right? So I figure it probably has something to do with the GHC magic that follows the definition of ++:
{-# RULES
"++" [~1] forall xs ys. xs ++ ys = augment (\c n -> foldr c n xs) ys
#-}
Is that what helps avoid the stack overflow? Could someone provide a hint as to what’s going on in this piece of code?
This doesn’t stack overflow – even in the interpreter, where there are no optimizations and no rewrite rules – because it doesn’t use the stack.
Look at the definition of (++) given in the question.
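As a minimal sketch, here is the same definition under the hypothetical name `append` (chosen only to avoid clashing with the Prelude), together with a small check that the result is produced lazily. The name `firstThree` is likewise just for illustration.

```haskell
-- Hypothetical name `append`, chosen to avoid clashing with the Prelude's (++).
append :: [a] -> [a] -> [a]
append []     ys = ys
append (x:xs) ys = x : append xs ys  -- the recursive call sits under (:), so it is deferred

-- Only the first three cons cells are ever demanded, so this terminates
-- even though the left argument is an infinite list.
firstThree :: [Integer]
firstThree = take 3 (append [1 ..] [99])
```

Evaluating `firstThree` in GHCi should give `[1,2,3]`, which suggests that each step of `append` returns as soon as it has produced one cons cell rather than recursing to the end of the list first.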
The key thing is x : (xs ++ ys), that is, the recursion is guarded by the (:) “cons” constructor. Because Haskell is lazy, it allocates a cons cell whose tail is a thunk for the recursive call, and that thunk lives on the heap. So your stack allocation is now heap allocation, which can expand quite a bit. All this does is walk the big list, allocating new cons cells on the heap to replace the ones it is traversing. Easy!
“reverse” is a bit different. It is a tail-recursive, accumulator-style function, so again, it allocates on the heap.
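A sketch of what an accumulator-style reverse can look like is below. The names `reverseAcc` and `go` are illustrative assumptions, not necessarily the exact source of the Prelude's reverse.

```haskell
-- Accumulator-style reverse (illustrative names, not the library source).
reverseAcc :: [a] -> [a]
reverseAcc xs = go xs []
  where
    -- `go` is tail recursive: the recursive call is the whole result,
    -- and the reversed prefix is built up as heap-allocated cons cells in `acc`.
    go []     acc = acc
    go (y:ys) acc = go ys (y : acc)
```

For example, `head (reverseAcc ([1..1000] ++ [99]))` evaluates to `99`, mirroring the expression from the question on a smaller list.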
So you see, the functions rely on using cons cells on the heap, instead of on the stack, hence no stack overflow.
To really nail this, look at the garbage collector statistics reported by the GHC runtime.
There’s your big list: it is allocated on the heap, and we spend 80% of the time cleaning up cons nodes that are created by (++).
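To reproduce this kind of measurement, one option is a small standalone program built around the expression from the question. This is a sketch: the function name `lastViaReverse` is hypothetical, and the exact figures will vary with GHC version, flags, and machine.

```haskell
module Main where

-- Hypothetical helper wrapping the expression from the question.
lastViaReverse :: Int -> Int
lastViaReverse n = head (reverse ([1 .. n] ++ [99]))

main :: IO ()
main = print (lastViaReverse 10000000)  -- prints 99
```

Assuming the file is compiled with `ghc -rtsopts` and the resulting binary is run with `+RTS -s`, the runtime prints a summary that includes bytes allocated on the heap and the time spent in the garbage collector, which is where the cost of all those cons cells shows up.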
Lesson: you can often trade stack for heap.