I have functions like:
millionsOfCombinations = [[a, b, c, d] |
    a <- filter (...some filter...) someListOfAs,
    b <- (...some other filter...) someListOfBs,
    c <- someListOfCs, d <- someListOfDs]

aLotOfCombinationsOfCombinations = [[comb1, comb2, comb3] |
    comb1 <- millionsOfCombinations,
    comb2 <- millionsOfCombinations,
    comb3 <- someList,
    ...around 10 function calls to find if
    [comb1, comb2, comb3] is actually useful]
Evaluating millionsOfCombinations takes 40 seconds on a very fast workstation. Evaluating aLotOfCombinationsOfCombinations !! 0 took 2 days 🙁
How can I speed up this code? So far I’ve had two ideas. The first is to use a profiler: I tried running myapp +RTS -sstderr after compiling with GHC, but I get a blank screen and don’t want to wait days for it to finish.
The second is to somehow cache millionsOfCombinations. Do I understand correctly that for each value in aLotOfCombinationsOfCombinations, millionsOfCombinations gets evaluated multiple times? If so, how can I cache the result? Obviously I’ve just started learning Haskell. I know there is a way to do call caching with a monad, but I still don’t understand those things.
Use the -fforce-recomp, -O2 and -fllvm flags
If you aren’t already, be sure to use the above flags. I wouldn’t normally mention it, but I’ve seen some questions recently whose authors didn’t know that powerful optimization isn’t the default.
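As a rough sketch of what that looks like in practice, the build line can sit in a comment at the top of the file. The program itself is only a hypothetical stand-in (hotLoop is not from the question), and -fllvm assumes a compatible LLVM toolchain is installed alongside GHC.

```haskell
-- Main.hs
-- Build with optimisation turned on (assumes ghc is on the PATH;
-- -fllvm additionally needs an LLVM toolchain that GHC can find):
--
--   ghc -O2 -fforce-recomp -fllvm Main.hs
--
-- Dropping -fllvm falls back to GHC's native code generator.
module Main where

import Data.List (foldl')

-- Hypothetical stand-in for a hot loop: sum of the even squares up to n.
hotLoop :: Int -> Int
hotLoop n = foldl' (+) 0 [x * x | x <- [1 .. n], even x]

main :: IO ()
main = print (hotLoop 1000000)
```

Timing the same binary built with and without -O2 is a quick way to see whether the flags matter for a given program.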
Profile Your Code
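Because the full run takes days, one option is to profile a scaled-down version of the same shape first. The sketch below is an illustration under assumptions, not the asker's code: smallCombos, useful and pairs are hypothetical stand-ins for the lists and checks in the question, and the SCC pragmas mark cost centres by hand. The build line uses the -prof and -auto-all flags this answer mentions (newer GHC releases document the latter as -fprof-auto), plus -rtsopts, which may be needed for the runtime to accept +RTS options.

```haskell
-- Main.hs
-- Build and run with time profiling:
--
--   ghc -O2 -prof -auto-all -rtsopts Main.hs
--   ./Main +RTS -p -RTS      (writes a report to Main.prof)
--
module Main where

-- Hypothetical, deliberately small stand-in for millionsOfCombinations.
smallCombos :: Int -> [[Int]]
smallCombos n = {-# SCC "smallCombos" #-}
  [[a, b, c] | a <- [1 .. n], even a, b <- [1 .. n], c <- [1 .. n]]

-- Hypothetical stand-in for the "is this combination useful" checks.
useful :: [[Int]] -> Bool
useful xs = {-# SCC "useful" #-} (sum (map sum xs) `mod` 7 == 0)

-- Same nesting as the question: combinations of combinations, filtered.
pairs :: Int -> [[[Int]]]
pairs n = [[x, y] | x <- cs, y <- cs, useful [x, y]]
  where
    cs = smallCombos n

main :: IO ()
main = print (length (pairs 6))
```

Growing the argument to pairs step by step keeps each profiling run short while still showing which cost centre dominates.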
The -sstderr flag isn’t exactly profiling. When people say profiling they’re usually talking about either heap profiling or time profiling via the -prof and -auto-all flags.
Avoid Costly Primitives
If you need the entire list in memory (i.e. it isn’t going to be optimized away) then consider unboxed vectors. If Int will do instead of Integer, consider that (but Integer is a reasonable default when you don’t know!). Use worker/wrapper transforms at the right times. If you’re leaning heavily on Data.Map, try using Data.HashMap from the unordered-containers library. This list can go on and on, but since you don’t already have an intuition on where your computation time is going the profiling should come first!
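To make two of those suggestions concrete, here is a minimal sketch of one reading of the Int and worker/wrapper advice: a small wrapper handles the edge case once, and a local worker runs a tight loop over machine Ints with a strict accumulator. The function names are made up for illustration, and the Integer version is included only for comparison.

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- Wrapper: the public entry point, which checks its argument once.
sumSquaresTo :: Int -> Int
sumSquaresTo n
  | n < 0     = 0
  | otherwise = go 0 1
  where
    -- Worker: a loop over Ints; the bang keeps the accumulator evaluated
    -- so no chain of unevaluated additions builds up.
    go :: Int -> Int -> Int
    go !acc i
      | i > n     = acc
      | otherwise = go (acc + i * i) (i + 1)

-- The same computation on arbitrary-precision Integer, for comparison.
sumSquaresToInteger :: Integer -> Integer
sumSquaresToInteger n = sum [i * i | i <- [1 .. n]]

main :: IO ()
main = do
  print (sumSquaresTo 1000)
  print (sumSquaresToInteger 1000)
```

The Int version is only safe while the result fits in a machine word, which is why Integer remains the sensible default when the range is unknown.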