I’m very new to Haskell, and I have a question about what performance improvements can be had by using impure (mutable) data structures. I’m trying to piece together a few different things I’ve heard, so please bear with me if my terminology is not entirely correct, or if there are some small errors.
To make this concrete, consider the quicksort algorithm (taken from the Haskell wiki).
quicksort :: Ord a => [a] -> [a]
quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
    where
        lesser = filter (< p) xs
        greater = filter (>= p) xs
This is not “true quicksort.” A “true” quicksort algorithm is in-place, and this is not. This is very memory inefficient.
On the other hand, it is possible to use vectors in Haskell to implement an in-place quicksort. An example is given in this stackoverflow answer.
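To make "in-place" concrete, here is a minimal sketch of my own. It is not the code from the linked answer: it uses `STUArray` from the `array` package instead of vectors, fixes the element type to `Int`, and the names `quicksortInPlace`, `sortRange`, `partitionRange` and `swap` are made up for illustration. The list is copied into one unboxed mutable array, and every later step only swaps elements inside that array.

```haskell
import Control.Monad (when)
import Control.Monad.ST (ST)
import Data.Array.ST (STUArray, newListArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (elems)

-- Copy the list into an unboxed mutable array, sort that array in place,
-- then freeze it and read the elements back out as a list.
quicksortInPlace :: [Int] -> [Int]
quicksortInPlace xs = elems sorted
  where
    n = length xs
    sorted = runSTUArray (do
        arr <- newListArray (0, n - 1) xs
        sortRange arr 0 (n - 1)
        return arr)

-- Sort the slice [lo .. hi] of the array without allocating new arrays.
sortRange :: STUArray s Int Int -> Int -> Int -> ST s ()
sortRange arr lo hi = when (lo < hi) $ do
    p <- partitionRange arr lo hi
    sortRange arr lo (p - 1)
    sortRange arr (p + 1) hi

-- Lomuto partition with the last element as pivot. This is the simplest
-- scheme, not the fastest: already sorted input hits the quadratic worst case.
partitionRange :: STUArray s Int Int -> Int -> Int -> ST s Int
partitionRange arr lo hi = do
    pivot <- readArray arr hi
    let loop i j
          | j >= hi = return i
          | otherwise = do
              x <- readArray arr j
              if x < pivot
                then swap arr i j >> loop (i + 1) (j + 1)
                else loop i (j + 1)
    p <- loop lo lo
    swap arr p hi
    return p

-- Exchange the elements at positions i and j.
swap :: STUArray s Int Int -> Int -> Int -> ST s ()
swap arr i j = do
    a <- readArray arr i
    b <- readArray arr j
    writeArray arr i b
    writeArray arr j a
```

The list version builds two fresh lists with `filter` at every level of recursion, plus the intermediate results of `++`. The array version allocates once and then only reads and writes `Int` slots, which is where any speed difference would have to come from.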
How much faster is the second algorithm than the first? Big O notation doesn’t help here, because the performance improvement is going to come from using memory more efficiently, not from having a better algorithm (right?). I tried to construct some test cases on my own, but I had difficulty getting things running.
An ideal answer would give some idea of what makes the in-place Haskell algorithm faster theoretically, and an example comparison of running times on some test data set.
There’s nothing better than a test, right? And the results are not unsurprising: for lists of random integers in range [0 .. 1000000],
Here, Data.List.sort is just what it is, Naïve.quicksort is the algorithm you quoted, and UArray_IO.quicksort and Vector_Mutable.quicksort are taken from the question you linked to: klapaucius’ and Dan Burton’s answers, which turn out to be very suboptimal performance-wise (see how much better Daniel Fischer could do it), both wrapped so as to accept lists (not sure if I got this quite right):
and
respectively.
As you can see, the naïve algorithm is not far behind the mutable solution with Data.Vector in terms of speed for sorting a list of randomly generated integers, and the IOUArray is actually much worse. The test was carried out on an Intel i5 laptop running Ubuntu 11.10 x86-64.
The following doesn’t really make much sense considering that ɢᴏᴏᴅ mutable implementations are, after all, still well ahead of all those compared here.
Note that this does not mean that a nice list-based program can always keep up with its mutably implemented equivalents, but GHC sure does a great job at bringing the performance close. Also, it depends of course on the data: these are the times when the randomly generated lists to sort contain values between 0 and 1000 rather than between 0 and 1000000 as above, i.e. with many duplicates:
Not to speak of pre-sorted arrays.
What’s quite interesting (it only becomes apparent with really large sizes, which require rtsopts to increase the stack capacity) is how both mutable implementations become significantly slower with -fllvm -O2:
It seems kind of logical to me that the immutable implementations fare better on llvm (doesn’t it do everything immutably on some level?), though I don’t understand why this only becomes apparent as a slowdown to the mutable versions at high optimisation and large data sizes.
Testing program:
which takes the algorithm name and array size on the command line. Runtime comparison was done with this program:
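For anyone who wants to repeat a comparison of this kind, a small timing harness could look roughly like the sketch below. To be clear, this is not the program used for the measurements above, just an illustration under some assumptions: the names `selectSort`, `timeSort`, `runBenchmark` and `pseudoRandoms` are made up, the input comes from a crude linear congruential generator rather than a proper random library (and assumes a 64-bit `Int`), and only the two list-based sorts are wired in.

```haskell
import Control.Exception (evaluate)
import Data.List (foldl', sort)
import System.CPUTime (getCPUTime)

-- The list-based quicksort from the question.
naiveQuicksort :: Ord a => [a] -> [a]
naiveQuicksort [] = []
naiveQuicksort (p:xs) = naiveQuicksort lesser ++ [p] ++ naiveQuicksort greater
  where
    lesser  = filter (< p) xs
    greater = filter (>= p) xs

-- Hypothetical stand-in for a random generator: n values from a linear
-- congruential sequence, reduced to the range [0 .. bound].
pseudoRandoms :: Int -> Int -> [Int]
pseudoRandoms bound n = map (`mod` (bound + 1)) (take n (tail (iterate step 42)))
  where
    step x = (x * 1103515245 + 12345) `mod` 2147483648

-- Look up a sorting function by the name given on the command line.
selectSort :: String -> Maybe ([Int] -> [Int])
selectSort "list"  = Just sort
selectSort "naive" = Just naiveQuicksort
selectSort _       = Nothing

-- Sort n pseudo-random values and return the CPU time taken, in seconds.
timeSort :: ([Int] -> [Int]) -> Int -> IO Double
timeSort sortFn n = do
    let input = pseudoRandoms 1000000 n
    -- Force the input first so that generating it is not part of the timing.
    _ <- evaluate (foldl' (+) 0 input)
    start <- getCPUTime
    -- Summing the result forces every element of the sorted list.
    _ <- evaluate (foldl' (+) 0 (sortFn input))
    end <- getCPUTime
    -- getCPUTime reports picoseconds.
    return (fromIntegral (end - start) / 1e12)

-- Interpret the arguments as: algorithm name, then input size.
runBenchmark :: [String] -> IO ()
runBenchmark [name, sizeStr] =
    case selectSort name of
      Nothing     -> putStrLn ("unknown algorithm: " ++ name)
      Just sortFn -> do
          secs <- timeSort sortFn (read sizeStr)
          putStrLn (name ++ ": " ++ show secs ++ " s")
runBenchmark _ = putStrLn "usage: <algorithm> <size>"
```

To turn this into a command-line program, one would add `main = getArgs >>= runBenchmark`, with `getArgs` imported from `System.Environment`. A single CPU-time reading like this is noisy, so for numbers worth quoting it is better to repeat the runs or use a dedicated benchmarking library.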