This is an interview question (phone screen): write a function (in Java) to find all permutations of a given word that appear in a given text. For example, for word `abc` and text `abcxyaxbcayxycab` the function should return `abc`, `bca`, `cab`.
I would answer this question as follows:
- Obviously I can loop over all permutations of the given word and use a standard `substring` function. However, it might be difficult (for me right now) to write code to generate all word permutations.
- It is easier to loop over all text substrings of the word size, sort each substring and compare it with the “sorted” given word. I can code such a function immediately.
- I can probably modify some substring search algorithm, but I do not remember these algorithms now.
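The second option (sort each window and compare it with the sorted word) can be sketched roughly as below. This is only a minimal illustration: the class name `SortedWindowSearch` and the method name `findPermutations` are made up for this example, and returning nothing for an empty word is an assumption, since the question does not cover that case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortedWindowSearch {
    // Returns every substring of text that is a permutation of word,
    // in order of appearance (duplicates are kept).
    static List<String> findPermutations(String word, String text) {
        List<String> matches = new ArrayList<>();
        int n = word.length();
        if (n == 0) {
            return matches; // assumption: an empty word matches nothing
        }

        char[] sortedWord = word.toCharArray();
        Arrays.sort(sortedWord);

        // Slide a window of length n over the text.
        for (int i = 0; i + n <= text.length(); i++) {
            String window = text.substring(i, i + n);
            char[] sortedWindow = window.toCharArray();
            Arrays.sort(sortedWindow);
            if (Arrays.equals(sortedWord, sortedWindow)) {
                matches.add(window);
            }
        }
        return matches;
    }
}
```

Each window is sorted from scratch, so for a text of length m and a word of length n this costs roughly O(m · n log n).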
How would you answer this question?
This is probably not the most efficient solution algorithmically, but it is clean from a class design point of view. This solution takes the approach of comparing “sorted” given words.
We can say that a word is a permutation of another if it contains the same letters, each the same number of times. This means that you can convert the word from a `String` to a `Map<Character,Integer>`. Such a conversion has complexity O(n), where n is the length of the `String`, assuming that insertions in your `Map` implementation cost O(1).
The `Map` will contain as keys all the characters found in the word and as values the frequencies of the characters.
Example. `abbc` is converted to `[a->1, b->2, c->1]`
`bacb` is converted to `[a->1, b->2, c->1]`
So if you need to know whether two words are permutations of each other, you can convert them both into maps and then invoke `Map.equals`.
Then you have to iterate over the text string and apply the transformation to all the substrings that have the same length as the word you are looking for.
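A minimal sketch of this idea, assuming a `HashMap` as the `Map` implementation. The class and method names (`FrequencyMapSearch`, `frequencies`, `isPermutation`, `findPermutations`) are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FrequencyMapSearch {
    // Converts a string into a map from character to number of occurrences.
    static Map<Character, Integer> frequencies(String s) {
        Map<Character, Integer> counts = new HashMap<>();
        for (char c : s.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    // Two words are permutations of each other if their frequency maps are equal.
    static boolean isPermutation(String a, String b) {
        return frequencies(a).equals(frequencies(b));
    }

    // Rebuilds the map for every window of the word's length and compares it
    // with the map of the word.
    static List<String> findPermutations(String word, String text) {
        List<String> matches = new ArrayList<>();
        int n = word.length();
        if (n == 0) {
            return matches; // assumption: an empty word matches nothing
        }
        Map<Character, Integer> target = frequencies(word);
        for (int i = 0; i + n <= text.length(); i++) {
            String window = text.substring(i, i + n);
            if (target.equals(frequencies(window))) {
                matches.add(window);
            }
        }
        return matches;
    }
}
```

With these definitions, `isPermutation("abbc", "bacb")` is true because both words map to `[a->1, b->2, c->1]`. Rebuilding the map for every window costs O(n) per position.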
Improvement proposed by Inerdial
This approach can be improved by updating the Map in a “rolling” fashion.
I.e. if you’re matching at index `i=3` in the example haystack in the OP (the substring `xya`), the map will be `[a->1, x->1, y->1]`. When advancing in the haystack, decrement the character count for `haystack[i]`, and increment the count for `haystack[i+needle.length()]`.
(Dropping zeroes to make sure `Map.equals()` works, or just implementing a custom comparison.)
Improvement proposed by Max
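Max's counter builds on the rolling window from Inerdial's improvement. As a baseline for comparison, that rolling update (still calling `Map.equals` at every position) might be sketched as follows. The class and method names are hypothetical, a `HashMap` is assumed, and zero counts are dropped so that `Map.equals` keeps working.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RollingMapSearch {
    static List<String> findPermutations(String needle, String haystack) {
        List<String> matches = new ArrayList<>();
        int n = needle.length();
        if (n == 0 || haystack.length() < n) {
            return matches; // assumption: nothing to report in these cases
        }

        // Frequency map of the needle, built once.
        Map<Character, Integer> target = new HashMap<>();
        for (char c : needle.toCharArray()) {
            target.merge(c, 1, Integer::sum);
        }

        // Frequency map of the first window, haystack[0 .. n-1].
        Map<Character, Integer> window = new HashMap<>();
        for (int i = 0; i < n; i++) {
            window.merge(haystack.charAt(i), 1, Integer::sum);
        }

        for (int i = 0; ; i++) {
            if (window.equals(target)) {
                matches.add(haystack.substring(i, i + n));
            }
            if (i + n >= haystack.length()) {
                break; // no further window to advance to
            }
            // Decrement the count for haystack[i]; returning null removes the key,
            // which drops zero counts from the map.
            window.computeIfPresent(haystack.charAt(i), (k, v) -> v == 1 ? null : v - 1);
            // Increment the count for haystack[i + n], the character entering the window.
            window.merge(haystack.charAt(i + n), 1, Integer::sum);
        }
        return matches;
    }
}
```

Each step now touches only two map entries, but the `Map.equals` call still compares the whole map at every position.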
What if we also introduce a `matchedCharactersCnt` variable? At the beginning of the haystack it will be `0`. Every time you change your map towards the desired value – you increment the variable. Every time you change it away from the desired value – you decrement the variable. At each iteration you check whether the variable is equal to the length of the needle. If it is – you’ve found a match. It would be faster than comparing the full map every time.
Pseudocode provided by Max: