I have 5000 vectors stored in 5000 files, and I need to find their sum. The type DF2 is just a synonym for Vector Double, made an instance of Num. So I read and parse all of those files into a list of type [IO DF2] and fold it:
getFinal :: IO DF2
getFinal = foldl1' (liftA2 (+)) $ map getDF2 [1..(sdNumber runParameters)]
  where getDF2 i = fmap parseDF2 $ readFile ("DF2/DF2_" ++ show i)
However I get an error:
DF2: DF2/DF2_1022: openFile: resource exhausted (Too many open files)
Google revealed this question to be very common:
However, I don’t understand what the problem with lazy IO is. If it is lazy, why does it open files before they are needed? I also couldn’t work out how to adapt the elegant solution by Duncan Coutts to my case.
It’s not that it opens files before they’re needed; it’s that it doesn’t close them until you force the entire string. A simple way to work around this problem is to force the entire string immediately after reading it; since Vectors are strict, the simplest way to do this is to force the Vector to be evaluated after parsing it:
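A minimal sketch of what that could look like. The real parser isn't shown in the question, so parseDF2 below is a stand-in that assumes one number per line, and DF2 is assumed to be an unboxed Vector Double from the vector package:

```haskell
import Control.Exception (evaluate)
import qualified Data.Vector.Unboxed as V

-- Assumption: DF2 is an unboxed vector of Doubles.
type DF2 = V.Vector Double

-- Stand-in parser (assumption): one Double per line.
parseDF2 :: String -> DF2
parseDF2 = V.fromList . map read . lines

-- Read one file and force the parsed vector before returning.
-- Building the unboxed vector walks the whole list of lines, so the
-- lazy string is read to the end and the handle gets closed.
getDF2 :: Int -> IO DF2
getDF2 i = do
  str <- readFile ("DF2/DF2_" ++ show i)
  evaluate (parseDF2 str)
```

With getDF2 written this way, each file should be closed before the next one is opened, so the fold in getFinal never holds more than one handle at a time.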
This uses Control.Exception.evaluate; you can think of evaluate as forcing its argument and then returning it. This only works if parseDF2 consumes the whole string, however.
A more elegant solution would be to move away from lazy IO entirely, and use iteratees or something of the sort. But that’s probably not worth it for such a simple use-case.
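If parseDF2 might not consume the whole string, a middle ground that still avoids iteratees is to force the string itself before the handle goes out of scope. A minimal sketch using only base; readFileStrict is a hypothetical helper name, not something from the question:

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Read a whole file into memory and close the handle before returning.
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \h -> do
  s <- hGetContents h
  -- Computing the length walks the entire string, so all of the file
  -- has been read by the time withFile closes the handle.
  _ <- evaluate (length s)
  return s
```

Swapping readFile for readFileStrict in getDF2 should then keep at most one file open at a time regardless of how much of the input the parser looks at, at the cost of holding each file's contents in memory while it is parsed.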