I stumbled upon a problem with Eval monad and rpar Strategy in Haskell. Consider following code:
module Main where
import Control.Parallel.Strategies
main :: IO ()
main = print . sum . inParallel2 $ [1..10000]
inParallel :: [Double] -> [Double]
inParallel xss = runEval . go $ xss
where
go [] = return []
go (x:xs) = do
x' <- rpar $ x + 1
xs' <- go xs
return (x':xs')
inParallel2 :: [Double] -> [Double]
inParallel2 xss = runEval . go $ xss
where
go [] = return []
go [x] = return $ [x + 1]
go (x:y:xs) = do
(x',y') <- rpar $ (x + 1, y + 1)
xs' <- go xs
return (x':y':xs'
I compile and run it like this:
ghc -O2 -Wall -threaded -rtsopts -fforce-recomp -eventlog eval.hs
./eval +RTS -N3 -ls -s
When I use inParallel function parallelism works as expected. In the output runtime statistics I see:
SPARKS: 100000 (100000 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
When I switch to inParallel2 function all parallelism is gone:
SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)
Why doesn’t evaluation of tuples work in parallel? I tried forcing the tuple before passing it to rpar:
rpar $!! (x + 1, y + 1)
but still no result. What am I doing wrong?
The
rparstrategy annotates a term for possible evaluation in parallel, but only up to weak head normal form, which essentially means, up to the outermost constructor. So for an integer or double, that means full evaluation, but for a pair, only the pair constructor, not its components, will get evaluated.Forcing the pair before passing it to
rparis not going to help. Now you’re evaluating the pair locally, before annotating the already evaluated tuple for possible parallel evaluation.You probably want to combine the
rparwith therdeepseqstrategy, thereby stating that the term should be completely evaluated, if possible in parallel. You can do this by sayingThe
dotoperator is for composing strategies.There is, however, yet another problem with your code: pattern matching forces immediate evaluation, and therefore using pattern matching for
rpar-annotated expressions is usually a bad idea. In particular, the linewill defeat all parallelism, because before the spark can be picked up for evaluation by another thread, the local thread will already start evaluating it in order to match the pattern. You can prevent this by using a lazy / irrefutable pattern:
Or alternatively use
fstandsndto access the components of the pair.Finally, don’t expect actual speedup if you create sparks that are as cheap as adding one to an integer. While sparks themselves are relatively cheap, they are not cost-free, so they work better if the computation you are annotating for parallel evaluation is somewhat costly.
You might want to read some tutorials on using strategies, such as Simon Marlow’s
Parallel and Concurrent Programming using Haskell or my own Deterministic Parallel Programming in Haskell.