I have generalized the existing Data.List.partition implementation
partition :: (a -> Bool) -> [a] -> ([a],[a])
partition p xs = foldr (select p) ([],[]) xs
  where
    -- select :: (a -> Bool) -> a -> ([a], [a]) -> ([a], [a])
    select p x ~(ts,fs) | p x       = (x:ts,fs)
                        | otherwise = (ts, x:fs)
to a “tri-partition” function
ordPartition :: (a -> Ordering) -> [a] -> ([a],[a],[a])
ordPartition cmp xs = foldr select ([],[],[]) xs
  where
    -- select :: a -> ([a], [a], [a]) -> ([a], [a], [a])
    select x ~(lts,eqs,gts) = case cmp x of
        LT -> (x:lts,eqs,gts)
        EQ -> (lts,x:eqs,gts)
        GT -> (lts,eqs,x:gts)
But now I’m facing confusing behaviour: when compiled with ghc -O1, the foo and bar functions run in constant space, but the doo function leads to a space leak.
foo xs = xs1
  where
    (xs1,_,_) = ordPartition (flip compare 0) xs

bar xs = xs2
  where
    (_,xs2,_) = ordPartition (flip compare 0) xs

-- pass-thru "least" non-empty partition
doo xs | null xs1  = if null xs2 then xs3 else xs2
       | otherwise = xs1
  where
    (xs1,xs2,xs3) = ordPartition (flip compare 0) xs

main :: IO ()
main = do
    print $ foo [0..100000000::Integer] -- results in []
    print $ bar [0..100000000::Integer] -- results in [0]
    print $ doo [0..100000000::Integer] -- results in [0] with space-leak
So my questions now are:
- What is the reason for the space leak in doo? It seems surprising to me, since foo and bar don’t exhibit such a space leak.
- Is there a way to implement ordPartition such that, when used in the context of functions such as doo, it runs in constant space?
It’s not a space leak. To find out whether a component list is empty, the entire input list has to be traversed, and if the component is indeed empty, the other component lists are constructed (as thunks) along the way. In the doo case, xs1 is empty, so the entire thing has to be built before deciding what to output.

That is a fundamental property of all partitioning algorithms: if one of the results is empty and you check for its emptiness as a condition, that check cannot be completed before the entire list has been traversed.
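
The difference between consuming a non-empty component and testing an empty one can be sketched as follows. This is only an illustration under assumptions: ordPartition is copied from the question, while firstThreeGreater and leastNonEmpty are hypothetical names introduced here (leastNonEmpty mirrors the question's doo).

```haskell
-- Same definition as in the question.
ordPartition :: (a -> Ordering) -> [a] -> ([a], [a], [a])
ordPartition cmp = foldr select ([], [], [])
  where
    select x ~(lts, eqs, gts) = case cmp x of
      LT -> (x : lts, eqs, gts)
      EQ -> (lts, x : eqs, gts)
      GT -> (lts, eqs, x : gts)

-- Hypothetical helper: a non-empty component is produced lazily, so
-- its first elements are available after inspecting only a prefix of
-- the input. This works even though the input list is infinite.
firstThreeGreater :: [Integer]
firstThreeGreater = take 3 gts
  where
    (_, _, gts) = ordPartition (flip compare 0) [0 ..]

-- Hypothetical helper mirroring doo: deciding that lts is empty needs
-- the whole input, so this only terminates on finite lists. While
-- `null lts` runs, eqs and gts are still referenced and must be kept.
leastNonEmpty :: [Integer] -> [Integer]
leastNonEmpty xs
  | not (null lts) = lts
  | not (null eqs) = eqs
  | otherwise      = gts
  where
    (lts, eqs, gts) = ordPartition (flip compare 0) xs
```

Here firstThreeGreater can stop early because the answer is settled once three elements have been found, whereas leastNonEmpty cannot return anything until `null lts` has seen the end of the list, and by then the other two components have accumulated in memory.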