I’ve been trying to find the optimal solution to the following (interesting?) problem that came up at work. Eventually I settled for a good-enough solution, but I’d like to know if there’s a better one.
Let a1…an be an array of strings.
Let s1…sk be an unordered list of strings, all of them also members of the array.
The task is to find the minimal set of index ranges in a that the elements of s cover.
So for example, if a = [ “x”, “y”, “a”, “f”, “c” ] and s = { “c”, “y”, “f” }, the answer would be (1;1), (3;4), assuming that the array is indexed from zero.
a is typically fairly large (hundreds of thousands of elements), while s is relatively small, typically length(s) < log(length(a)).
So the question is: can you find a time-efficient algorithm for this problem? (Space efficiency is not a concern within reasonable limits.)
Just a quick but important update: I need to perform this operation many times, with different s values but always the same a. So precomputing stuff based on a is allowed; indeed, it is the only way.
Build a hash table H(a) to map from element to index: ax -> x, in O(n) time and space. Then look up each sy in H(a) (in O(1) time on average, for a total of O(k) for s) and keep track of the ranges. For that you can use an array of pair(min_index, max_index) sorted by min_index and do a binary search to either locate the range or find where you should insert the new 1-element range.

So overall, the solution above would take O( n + k + k * log( nb_ranges ) ) time and O( n + nb_ranges ) space.
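
The approach above can be sketched roughly as follows in Python. This is only an illustration under a few assumptions: the elements of a are distinct (otherwise the element-to-index map is ambiguous), every element of s really is in a, and the names build_index and covered_ranges are made up for this example. The sketch also merges a new index with a neighbouring range when they touch, which the description above leaves implicit.

```python
from bisect import bisect_right

def build_index(a):
    # One-time O(n) precomputation: map each element to its position in a.
    # Assumes the elements of a are distinct.
    return {value: i for i, value in enumerate(a)}

def covered_ranges(index, s):
    # Disjoint, non-adjacent (min_index, max_index) pairs sorted by min_index.
    ranges = []
    for value in s:
        i = index[value]  # O(1) on average
        # Binary search: position of the first range whose min_index is > i.
        pos = bisect_right(ranges, (i, float("inf")))
        lo = hi = i
        # Absorb the left neighbour if it already contains i or ends at i - 1.
        if pos > 0 and ranges[pos - 1][1] >= i - 1:
            pos -= 1
            lo = ranges[pos][0]
            hi = max(hi, ranges[pos][1])
            del ranges[pos]
        # Absorb the right neighbour if it starts at i + 1.
        if pos < len(ranges) and ranges[pos][0] == i + 1:
            hi = ranges[pos][1]
            del ranges[pos]
        ranges.insert(pos, (lo, hi))
    return ranges

index = build_index(["x", "y", "a", "f", "c"])  # reused for every query
print(covered_ranges(index, ["c", "y", "f"]))   # [(1, 1), (3, 4)]
```

Note that inserting into or deleting from a Python list shifts elements, so each update costs up to O(nb_ranges) rather than O(log(nb_ranges)); with s as small as described this should hardly matter. Another option, if you prefer to avoid the bookkeeping, would be to collect the k indices, sort them, and sweep once to group consecutive runs, for O(k log k) per query.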