Just got my feet wet in sorting algorithm with Haskell. I’ve implemented insertion-sort and merge-sort
insert_sort :: (Ord a, Show a) => [a] -> [a]
insert_sort keys = foldr f [] keys
where f key [] = [key]
f key acc = insert key acc
insert y [] = [y]
insert y (x:xs)
| x < y = x : insert y xs
| otherwise = y : x : xs
merge_sort :: (Ord a, Show a) => [a] -> [a]
merge_sort (x:[]) = [x]
merge_sort keys = merge (merge_sort (take len keys)) (merge_sort (drop len keys))
where len = length keys `div` 2
merge :: [a] -> [a] -> [a]
merge (x:xs) [] = (x:xs)
merge [] (y:ys) = (y:ys)
merge (x:xs) (y:ys) = if x <= y
then x : merge (xs) (y:ys)
else y : merge (x:xs) ys
Here’s how I compared their efficiency:
insert_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]
merge_sort $ take 100000 $ randomRs (1,100000) $ mkStdGen 1 ::[Int]
Both of them starts to print out results after a short delay but merge-sort prints much faster. As we know, merge-sort is much faster than insertion-sort for large data sets. I thought that would be shown by how they give results (like a long delay versus a short one) not how they print results. Is it because I use foldr in insertion-sort? What’s behind the scene?
EDIT: Thx guys. I’ve heard about lazy evaluation since I started to learn Haskell but yet got the hang of it. Would anybody illustrate a bit more with a small data set, say [5,2,6,3,1,4]? How is it possible to output results before finish sorting with foldr since the first elements comes at last?
Behind the scene is lazy evaluation. The start of the sorted lists is determined before the sort is complete, so it can be output before the work is finished. Since a mergesort is faster, the merge-sorted list is printed out faster.
As requested: how sorting
[5,2,6,3,1,4]proceeds. I useinsert_sort = foldr ins []for brevity.And merge sort (abbreviations:
merge = mg,merge_sort = ms):Admittedly I’ve taken a few short cuts, but Haskell isn’t the only lazy one.