I’m just digging a bit into Haskell and I started by trying to compute the Phi-Coefficient of two words in a text. However, I ran into some very strange behaviour that I cannot explain.
After stripping everything down, I ended up with this code to reproduce the problem:
let sumTup = (sumTuples∘concat) frequencyLists
let sumFixTup = (138, 136, 17, 204)
putStrLn (show ((138, 136, 17, 204) == sumTup))
putStrLn (show (phi sumTup))
putStrLn (show (phi sumFixTup))
This outputs:
True
NaN
0.4574206676616167
So although sumTup and sumFixTup show as equal, they behave differently when passed to phi.
The definition of phi is:
phi (a, b, c, d) =
    let dividend = fromIntegral (a * d - b * c)
        divisor = sqrt (fromIntegral ((a + b) * (c + d) * (a + c) * (b + d)))
    in dividend / divisor
This might be a case of integer overflow. The value being passed into fromIntegral in your divisor is 3191195800, which is larger than a 32-bit signed Int can hold. In that case the product wraps around to a negative number, and sqrt of a negative Double is NaN.
In ghci (or whatever you’re using), use :t to see the types of those variables. I’m guessing you’ll find that sumTup is (Int, Int, Int, Int) (overflows) and sumFixTup is (Integer, Integer, Integer, Integer) (doesn’t overflow).
Edit: on second thought, a tuple of Ints can’t be equal to a tuple of Integers. Even so, I think that ghci will fix the type of sumFixTup to be a tuple of Integers, while sumTup probably has a type of the form (Num a) => (a, a, a, a) or (Integral a) => (a, a, a, a), which depends on the function defining it. Ghci will then convert them to Integers to compare with sumFixTup, but may convert them to Ints when calculating the divisor in phi, causing overflow.
Another edit: KennyTM, you’re half right:
So for the examples given in the question:
The literal (138, 136, 17, 204) is inferred to be a tuple of Int to match sumTup, and they compare equal.
sumTup consists of Ints, causing overflow as suggested above.
sumFixTup consists of Integers, giving a correct result.
Note that sumTup and sumFixTup were never compared directly, so my earlier edit was based on a misreading.
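To reproduce the difference regardless of how wide Int is on your machine, here is a minimal sketch that pins the element type explicitly. The type signature with an Integral constraint on phi, the use of Int32 as a stand-in for a 32-bit Int, and the names narrow and wide are my own assumptions for illustration and are not taken from the original code.

```haskell
import Data.Int (Int32)

-- Same formula as phi in the question, with an explicit signature (an
-- assumption) so the element type is chosen at the call site.
phi :: Integral a => (a, a, a, a) -> Double
phi (a, b, c, d) =
  let dividend = fromIntegral (a * d - b * c)
      divisor  = sqrt (fromIntegral ((a + b) * (c + d) * (a + c) * (b + d)))
  in dividend / divisor

main :: IO ()
main = do
  -- Int32 stands in for a 32-bit Int: the product 3191195800 does not fit
  -- and wraps around to a negative value before fromIntegral sees it.
  let narrow = (138, 136, 17, 204) :: (Int32, Int32, Int32, Int32)
      -- Integer is arbitrary precision, so the product is computed exactly.
      wide   = (138, 136, 17, 204) :: (Integer, Integer, Integer, Integer)
  print (phi narrow)  -- NaN, because sqrt receives a negative Double
  print (phi wide)    -- the question reports 0.4574206676616167 here
```

If this matches what you see, giving sumTuples (or phi) a signature that uses Integer, or converting with fromIntegral before multiplying, should avoid the overflow.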