I am solving a programming assignment for the Harvard CS 51 programming course in OCaml.
The problem is to define a function that compresses a list of chars into a list of pairs, where each pair contains the number of consecutive occurrences of a character in the list and the character itself. For example, applying this function to the list ['a';'a';'a';'a';'a';'b';'b';'b';'c';'d';'d';'d';'d'] should give the list [(5,'a');(3,'b');(1,'c');(4,'d')].
I came up with a function that uses an auxiliary function go to solve this problem:
let to_run_length (lst : char list) : (int*char) list =
  let rec go i s lst1 =
    match lst1 with
    | [] -> [(i,s)]
    | (x::xs) when s <> x -> (i,s) :: go 0 x lst1
    | (x::xs) -> go (i + 1) s xs
  in match lst with
    | x :: xs -> go 0 x lst
    | [] -> []
My question is: is it possible to define a recursive function to_run_length with nested pattern matching, without defining an auxiliary function go? If so, how can we store the state of the counter of elements already passed?
The way you have implemented to_run_length is correct, readable and efficient. It is a good solution. (Only nitpick: the indentation after in is wrong.)
If you want to avoid the intermediary function, you must use the information present in the return value of the recursive call instead. This can be described in a slightly more abstract way:
- the run-length encoding of a list x::xs is,
  - if the run-length encoding of xs starts with x, then …
  - otherwise, (1,x) :: run-length encoding of xs
(I intentionally do not provide source code, to let you work the details out, but unfortunately there is not much to hide with such relatively simple functions.)
Food for thought: You usually encounter this kind of technique when considering tail-recursive and non-tail-recursive functions (what I’ve done resembles turning a tail-recursive function into non-tail-recursive form). In this particular case, your original function was not tail-recursive. A function is tail-recursive when the flow of arguments/results only goes “down” the recursive calls (you return the result of each call directly, rather than reusing it to build a larger result). In my function, the flow of arguments/results only goes “up” the recursive calls (the calls are given the least information possible, and all the code logic is done by inspecting their results). In your implementation, the flow goes both “down” (the integer counter) and “up” (the encoded result).
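
To make the “down only” direction concrete, here is a minimal sketch of a tail-recursive variant. It is only an illustration, not part of the assignment: the name to_run_length_tail and the extra accumulator argument acc are my own assumptions. Everything the function knows is passed down through the arguments, and the result of each recursive call is returned unchanged.

```ocaml
(* Hypothetical tail-recursive variant: all state travels "down"
   through the arguments of go; nothing is built from a call's result. *)
let to_run_length_tail (lst : char list) : (int * char) list =
  (* acc: finished runs, most recent first
     i, s: length and character of the run currently being counted
     rest: input that has not been consumed yet *)
  let rec go acc i s rest =
    match rest with
    | [] -> List.rev ((i, s) :: acc)              (* close the last run *)
    | x :: xs when x = s -> go acc (i + 1) s xs   (* same run continues *)
    | x :: xs -> go ((i, s) :: acc) 1 x xs        (* start a new run *)
  in
  match lst with
  | [] -> []
  | x :: xs -> go [] 1 x xs
```

The price of the tail-recursive form is that the runs are accumulated in reverse order, hence the final List.rev.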
Edit: upon request of the original poster, here is my solution: