I’m learning Haskell and am trying to write code that runs as fast as what I can write in C. For this exercise, I’m writing an Euler integrator for a simple one-dimensional physical system.
- The C code is compiled with GCC 4.5.4 and -O3. It runs in 1.166 seconds.
- The Haskell code is compiled with GHC 7.4.1 and -O3. It runs in 21.3 seconds.
- If I compile Haskell with -O3 -fllvm, it runs in 4.022 seconds.
So, am I missing something to optimize my Haskell code?
PS.: I used the following arguments: 1e-8 5.
C code:
#include <stdio.h>

double p, v, a, t;

double func(double t) {
    return t * t;
}

void euler(double dt) {
    double nt = t + dt;
    double na = func(nt);
    double nv = v + na * dt;
    double np = p + nv * dt;

    p = np;
    v = nv;
    a = na;
    t = nt;
}

int main(int argc, char ** argv) {
    double dt, limit;
    sscanf(argv[1], "%lf", &dt);
    sscanf(argv[2], "%lf", &limit);

    p = 0.0;
    v = 0.0;
    a = 0.0;
    t = 0.0;

    while(t < limit) euler(dt);

    printf("%f %f %f %f\n", p, v, a, t);
    return 0;
}
Haskell Code:
import System.Environment (getArgs)

data EulerState = EulerState !Double !Double !Double !Double deriving(Show)
type EulerFunction = Double -> Double

main = do
    [dt, l] <- fmap (map read) getArgs
    print $ runEuler (EulerState 0 0 0 0) (**2) dt l

runEuler :: EulerState -> EulerFunction -> Double -> Double -> EulerState
runEuler s@(EulerState _ _ _ t) f dt limit = let s' = euler s f dt
                                             in case t `compare` limit of
                                                    LT -> s' `seq` runEuler s' f dt limit
                                                    _ -> s'

euler :: EulerState -> EulerFunction -> Double -> EulerState
euler (EulerState p v a t) f dt = (EulerState p' v' a' t')
    where t' = t + dt
          a' = f t'
          v' = v + a'*dt
          p' = p + v'*dt
I got a nice boost by applying a worker-wrapper transformation to runEuler. This helps f get inlined into the loop (which probably also happens in the C version), getting rid of a lot of overhead.
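For illustration, here is a minimal sketch of what such a worker-wrapper version of runEuler might look like. It is one plausible shape of the transformation, not necessarily the exact code that was measured. The local name go and the initial-state name s0 are my own choices; EulerState, EulerFunction and euler are taken from the question. The wrapper receives f, dt and limit once, and the recursive worker only threads the state, so f is a fixed free variable of the loop rather than an argument passed on every call.

```haskell
data EulerState = EulerState !Double !Double !Double !Double deriving (Show)

type EulerFunction = Double -> Double

-- One Euler step, unchanged from the question apart from layout.
euler :: EulerState -> EulerFunction -> Double -> EulerState
euler (EulerState p v _ t) f dt = EulerState p' v' a' t'
  where
    t' = t + dt
    a' = f t'
    v' = v + a' * dt
    p' = p + v' * dt

-- Wrapper: binds f, dt and limit once and hands off to the worker.
runEuler :: EulerState -> EulerFunction -> Double -> Double -> EulerState
runEuler s0 f dt limit = go s0
  where
    -- Worker: recurses on the state only. Like the version in the
    -- question, it returns the state one step past the limit.
    go s@(EulerState _ _ _ t)
      | t < limit = go s'
      | otherwise = s'
      where
        s' = euler s f dt
```

The main from the question can call this runEuler unchanged. As a quick check in GHCi, runEuler (EulerState 0 0 0 0) (\x -> x * x) 0.5 1 should evaluate to EulerState 1.25 1.75 2.25 1.5. Whether GHC actually inlines f into the loop depends on the compiler version and optimization flags, so it is worth re-measuring rather than assuming the same speedup.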