We’re interested in a data structure for binary strings. Let S=s1s2….sm be a binary string of size m. Shift(S,i) is a cyclic shift of string S i spaces to the left. That is, Shift(S,i)=sisi+1si+2…sms1…si-1. Suggest an efficient data structure that supports:
Init()of an empy DS in O(1)Insert(s)inserts a binary string to the DS in O(|s|^2)Search_cyclic(s)checks if there is aShift(S,i)for ANYiin O(|s|).
Space Complexity: O(|S1|+|S2|+…..+|Sm|) where Si is one if the m strings we’ve inserted this far.
If i had to find Search_cyclic(s,i) for some given i, this is quite simple with using a suffix tree and just traversing it in O(|s|). But here in Search_cyclic(s) we don’t have a given i, so I don’t know what to do in the given complexity. OTOH, Insert(s) generally takes O(|s|) to insert to a suffix tree and here we are given O(|s|^2).
So here is a solution I can propose to you. The complexities are even lower then the ones they asked of you but it may seem a bit complicated.
The data structure in which you keep all the strings will be a Trie or even a Patricia tree. In this tree for each string you want to insert the minimum cyclic shift(i.e. the cyclic shift of all possible ones which is minimum lexicographically) out of all of its possible shifts. You can calculate the minimum cyclic shift of a string in linear time and I will give one possible solution to that a bit later. For the moment lets assume you can do it. Here is how the operations required will be implemented:
Also the memory complexity is as required and may be even lower if you construct a Patricia.
So all that is left is to exaplain how to find the minimum cyclic shift. Since you mention suffix tree I hope you know how to construct it in linear time. So the trick is – you append your string s to itself(i.e. double it) and then you construct a suffix tree for the doubled string. This is still linear with respect to |s| so no problem there. After that all you have to do is to find the minimum of the suffixes of length n in this tree. This is not hard at all I believe – start from the root and always follow the link from the current node that has minimal string written on it until you accumulate length longer then |s|. Because of the doubling of the string, you will always be able to follow minimal string links until you accumulate length at least |s|.
Hope this answer helps.