I have a performance question today.
I am making a (Haskell) program and, when profiling, I saw that most of the time is spent in the function you can find below. Its purpose is to take the nth element of a list and return the list without it besides the element itself. My current (slow) definition is as follows:
breakOn :: Int -> [a] -> (a,[a])
breakOn 1 (x:xs) = (x,xs)
breakOn n (x:xs) = (y,x:ys)
where
(y,ys) = breakOn (n-1) xs
The Int argument is known to be in the range 1..n where n is the length of the (never null) list (x:xs), so the function never arises an error.
However, I got a poor performance here. My first guess is that I should change lists for another structure. But, before start picking different structures and testing code (which will take me lot of time) I wanted to ask here for a third person opinion. Also, I’m pretty sure that I’m not doing it in the best way. Any pointers are welcome!
Please, note that the type a may not be an instance of Eq.
Solution
I adapted my code tu use Sequences from the Data.Sequence module. The result is here:
import qualified Data.Sequence as S
breakOn :: Int -> Seq a -> (a,Seq a)
breakOn n xs = (S.index zs 0, ys <> (S.drop 1 zs))
where
(ys,zs) = S.splitAt (n-1) xs
However, I still accept further suggestions of improvement!
Yes, this is inefficient. You can do a bit better by using
splitAt(which unboxes the number during the recursive bit), a lot better by using a data structure with efficient splitting, e.g. a fingertree, and best by massaging the context to avoid needing this operation. If you post a bit more context, it may be possible to give more targeted advice.