In trying to learn Haskell, I have implemented a pi calculation in order to understand functions and recursion properly.
Using the Leibniz Formula for calculating pi, I came up with the following, which prints pi to the tolerance of the given parameter, and the number of recursive function calls in order to get that value:
reverseSign :: (Fractional a, Ord a) => a -> a
reverseSign num = ((if num > 0
                    then -1
                    else 1) * (abs(num) + 2))

piCalc :: (Fractional a, Integral b, Ord a) => a -> (a, b)
piCalc tolerance = piCalc' 1 0.0 tolerance 0

piCalc' :: (Ord a, Fractional a, Integral b) => a -> a -> a -> b -> (a, b)
piCalc' denom prevPi tolerance count = if abs(newPi - prevPi) < tolerance
                                        then (newPi, count)
                                        else piCalc' (reverseSign denom) newPi tolerance (count + 1)
                                        where newPi = prevPi + (4 / denom)
So when I run this in GHCi, it seems to work as expected:
*Main> piCalc 0.001
(3.1420924036835256,2000)
But if I set my tolerance too fine, this happens:
*Main> piCalc 0.0000001
(3.1415927035898146,*** Exception: stack overflow
This seems wholly counter-intuitive to me; the actual calculation works fine, but just trying to print how many recursive calls fails??
Why is this so?
This is a variant of the traditional foldl (+) 0 [1..1000000] stack overflow. The problem is that the count value is never evaluated during the evaluation of piCalc'. This means that it just carries an ever-growing chain of thunks representing the additions to be done if needed. When the value is finally needed, evaluating it requires stack depth proportional to the number of thunks, and that causes the overflow.

The simplest solution makes use of the BangPatterns extension, changing the start of piCalc' to

piCalc' denom prevPi tolerance !count = ...

This forces the value of count to be evaluated when the pattern is matched, which means that it will never grow a giant chain of thunks.

Equivalently, and without the use of an extension, you could write it as

piCalc' denom prevPi tolerance count = count `seq` ...

This is exactly equivalent semantically to the above solution, but it uses seq explicitly instead of implicitly via a language extension. This makes it more portable, but a bit more verbose.

As for why the approximation of pi is not a long sequence of nested thunks, but count is: piCalc' branches on the result of a computation that requires the values of newPi, prevPi, and tolerance. It must examine those values before it decides if it's done or if it needs to run another iteration. It's that branch that causes the evaluation to be performed (when the function application is performed, which usually means something is pattern-matching on the result of the function). On the other hand, nothing in the calculation of piCalc' depends on the value of count, so it isn't evaluated during the calculation.
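For reference, here is a minimal sketch of how the two fixes might look as a complete program. It reuses the question's definitions. The names piCalcSeq and piCalcSeq' are hypothetical, introduced only so the seq variant can sit next to the BangPatterns one in the same file.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Next Leibniz denominator with the sign flipped: 1, -3, 5, -7, ...
reverseSign :: (Fractional a, Ord a) => a -> a
reverseSign num = (if num > 0 then -1 else 1) * (abs num + 2)

-- Variant 1: the bang pattern forces count every time the pattern is
-- matched, so the counter never accumulates a chain of (+ 1) thunks.
piCalc :: (Fractional a, Integral b, Ord a) => a -> (a, b)
piCalc tolerance = piCalc' 1 0.0 tolerance 0

piCalc' :: (Ord a, Fractional a, Integral b) => a -> a -> a -> b -> (a, b)
piCalc' denom prevPi tolerance !count =
  if abs (newPi - prevPi) < tolerance
    then (newPi, count)
    else piCalc' (reverseSign denom) newPi tolerance (count + 1)
  where
    newPi = prevPi + 4 / denom

-- Variant 2 (hypothetical names): the same forcing done with seq,
-- which needs no language extension.
piCalcSeq :: (Fractional a, Integral b, Ord a) => a -> (a, b)
piCalcSeq tolerance = piCalcSeq' 1 0.0 tolerance 0

piCalcSeq' :: (Ord a, Fractional a, Integral b) => a -> a -> a -> b -> (a, b)
piCalcSeq' denom prevPi tolerance count =
  count `seq`
    if abs (newPi - prevPi) < tolerance
      then (newPi, count)
      else piCalcSeq' (reverseSign denom) newPi tolerance (count + 1)
  where
    newPi = prevPi + 4 / denom

main :: IO ()
main = do
  -- Both variants should print the same pair for the same tolerance.
  print (piCalc 0.001 :: (Double, Int))
  print (piCalcSeq 0.001 :: (Double, Int))
```

With either variant the counter is an evaluated number at every step, so a finer tolerance such as 0.0000001 should only cost more iterations rather than more stack when the count is printed.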