Given a sequence such as S = {1,8,2,1,4,1,2,9,1,8,4}, I need to find the minimal-length subsequence that contains all elements of S (no duplicates, order does not matter). How do I find this subsequence in an efficient way?
Note: There are 5 distinct elements in S: {1,2,4,8,9}. The minimum-length subsequence must contain all these 5 elements.
Algorithm:
First, determine the number of distinct elements in the array; this can easily be done in linear time. Let there be `k` distinct elements.
Allocate an array `cur` of size 10^5 (indexed by element value), where each entry records how many times that element occurs in the current subsequence (see below).
Hold a `cnt` variable that records how many distinct elements the currently considered subsequence contains.
Now take two indexes, `begin` and `end`, and iterate them through the array as follows: initialize `cnt` and `begin` to `0` and `end` to `-1` (so that it becomes `0` after the first increment). Then, while possible, repeat the following:
If `cnt != k`:
2.1. Increment `end`. If `end` has reached the end of the array, break. If `cur[array[end]]` is zero, increment `cnt`. Increment `cur[array[end]]`.
Else:
2.2. Try to advance the `begin` index: while `cur[array[begin]] > 1`, decrement `cur[array[begin]]` and increment `begin` (`cur[array[begin]] > 1` means that another copy of this element is present in the current subsequence). Afterwards, compare the `[begin, end]` interval with the current answer and store it if it is better.
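The steps above can be sketched roughly as follows. This is a minimal illustration rather than a definitive implementation, and it departs from the description in two places: it keeps the counts in a `std::unordered_map` instead of a fixed array of size 10^5 (so no bound on the element values is assumed), and it advances `end` once per loop iteration and shrinks from the left right after each extension. The function name `shortestCoveringWindow` and the `{-1, -1}` result for an empty input are assumptions made for this example.

```cpp
#include <unordered_map>
#include <utility>
#include <vector>

// Returns the inclusive index range [begin, end] of a shortest contiguous
// window of `a` that contains every distinct value of `a`.
// Returns {-1, -1} for an empty input (a convention chosen for this sketch).
std::pair<int, int> shortestCoveringWindow(const std::vector<int>& a) {
    const int n = static_cast<int>(a.size());
    if (n == 0) return {-1, -1};

    // cur[x] = how many times x occurs in the current window.
    // Creating a zero entry per value also gives the distinct count k.
    std::unordered_map<int, int> cur;
    for (int x : a) cur[x] = 0;
    const int k = static_cast<int>(cur.size());

    int cnt = 0;    // number of distinct values in the current window
    int begin = 0;  // left edge of the current window
    int bestBegin = -1, bestEnd = -1;

    for (int end = 0; end < n; ++end) {
        // Extend the window to the right by one element.
        if (cur[a[end]]++ == 0) ++cnt;

        // Drop elements on the left that still occur elsewhere in the window.
        while (cur[a[begin]] > 1) {
            --cur[a[begin]];
            ++begin;
        }

        // Once all k values are covered, keep the shortest window seen so far.
        if (cnt == k && (bestBegin < 0 || end - begin < bestEnd - bestBegin)) {
            bestBegin = begin;
            bestEnd = end;
        }
    }
    return {bestBegin, bestEnd};
}
```

For the example S = {1,8,2,1,4,1,2,9,1,8,4}, this returns the zero-based index range [6, 10], that is, the window {2,9,1,8,4}. Each index is visited at most once by `end` and at most once by `begin`, so the running time is linear on average given the hash map.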
Once the process can go no further, you have the answer. The complexity is `O(n)`: just two iterators passing through the array.
Implementation in C++: