I have two sorted lists of inputs:
let x = [2; 4; 6; 8; 8; 10; 12]
let y = [-8; -7; 2; 2; 3; 4; 4; 8; 8; 8;]
I want to write a function which behaves similarly to an SQL INNER JOIN. In other words, I want to return the part of the cartesian product of x and y in which both items are equal, so a value shared by both lists appears once for every matching pair:
join(x, y) = [2; 2; 4; 4; 8; 8; 8; 8; 8; 8]
I’ve written a naive version as follows:
let join x y =
    [for x' in x do
        for y' in y do
            yield (x', y')]
    |> List.choose (fun (x, y) -> if x = y then Some x else None)
It works, but it runs in O(x.length * y.length). Since both my lists are sorted, I think it's possible to get the results I want in O(min(x.length, y.length)).
How can I find common elements in two sorted lists in linear time?
O(min(n,m)) time is impossible: take two lists [x;x;…;x;y] and [x;x;…;x;z]. You have to traverse both lists to the end to compare y and z.
Even O(n+m) is impossible. Take
[1,1,…,1] – n times
and
[1,1,…,1] – m times
Then the resulting list has n*m elements, so you need at least Ω(n m) time just to create it.
Without the cartesian product (a simple merge), this is quite easy. OCaml code (I don’t know F#, but it should be reasonably close; compiled but not tested):
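The answer's original code is not preserved here. As a stand-in, here is a minimal sketch of a single-pass merge that intersects two sorted lists. The name `intersect` is an assumption, and equal elements are paired off one-to-one rather than multiplied, so this is the simple merge only, not the full join:

```ocaml
(* Intersection of two sorted lists in one merge pass, O(n + m).
   Equal heads are emitted once and both lists advance, so duplicates
   are paired off one-to-one (no cartesian product).
   Not tail-recursive; fine for a sketch, not for very long lists. *)
let rec intersect xs ys =
  match xs, ys with
  | [], _ | _, [] -> []
  | x :: xs', y :: ys' ->
    if x < y then intersect xs' ys        (* x cannot match anything later in ys *)
    else if x > y then intersect xs ys'   (* y cannot match anything later in xs *)
    else x :: intersect xs' ys'           (* common element *)
```

On the lists from the question this gives [2; 4; 8; 8], which shows why the plain merge is not enough for the join: the repeated matches are missing.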
(Edit: I was too late)
So your O(n m) code is the best possible in the worst case. However, if I understand correctly, it always performs n*m operations, which is not optimal.
My approach would be
1) write a function
group : 'a list -> ('a * int) list
that counts the number of consecutive equal elements:
group [1;1;1;1;1;2;2;3] = [(1,5);(2,2);(3,1)]
2) group both lists and merge the results with code similar to the simple merge, multiplying the counts of any element that appears in both
3) write a function
ungroup : ('a * int) list -> 'a list
and compose those three.
This has complexity O(n+m+x), where x is the length of the resulting list. This is the best possible up to a constant factor.
Edit: Here you go:
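The code that followed is not preserved, so the following is only a sketch of the three steps described above, not the answerer's original. `group` and `ungroup` follow the signatures given in the answer; `merge_counts` and `join` are names assumed for illustration. It also assumes both inputs are sorted, so equal elements are adjacent:

```ocaml
(* group: collapse runs of equal elements into (value, count) pairs.
   Relies on equal elements being adjacent, which holds for sorted input. *)
let group xs =
  let rec go acc = function
    | [] -> List.rev acc
    | x :: rest ->
      (match acc with
       | (y, n) :: acc' when y = x -> go ((y, n + 1) :: acc') rest
       | _ -> go ((x, 1) :: acc) rest)
  in
  go [] xs

(* merge_counts: walk two grouped lists in step. A value that occurs
   n times in one list and m times in the other occurs n * m times in
   the join, so the counts are multiplied. *)
let rec merge_counts gs hs =
  match gs, hs with
  | [], _ | _, [] -> []
  | (x, n) :: gs', (y, m) :: hs' ->
    if x < y then merge_counts gs' hs
    else if x > y then merge_counts gs hs'
    else (x, n * m) :: merge_counts gs' hs'

(* ungroup: expand (value, count) pairs back into a flat list. *)
let ungroup gs =
  List.concat (List.map (fun (x, n) -> List.init n (fun _ -> x)) gs)

(* join: compose the three steps. *)
let join xs ys = ungroup (merge_counts (group xs) (group ys))
```

With the lists from the question, `join x y` evaluates to [2; 2; 4; 4; 8; 8; 8; 8; 8; 8]. Grouping and merging take O(n+m), and `ungroup` takes time proportional to the output, which matches the O(n+m+x) bound.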