In C#, say you have an array of strings that contain only the characters ‘0’ and ‘1’:
string[] input = { "0101", "101", "11", "010101011" };
And you’d like to build a function:
public void IdentifySubstrings(string[] input) { ... }
That will produce the following:
"0101 is a substring of 010101011"
"101 is a substring of 0101"
"101 is a substring of 010101011"
"11 is a substring of 010101011"
And you are NOT able to use built-in string functionality (such as String.Substring).
How would one efficiently solve this problem? Of course you could plow through it via brute force, but it just feels like there ought to be a way to accomplish it with a tree (since the only values are 0’s and 1’s, it feels like a binary tree ought to fit somehow). I’ve read a little bit about things like suffix trees, but I’m uncertain if that’s the right path to be going down.
Any efficient solutions you can think of?
First of all, you have no choice but to examine each byte (or bit 😉) in the searched string at least once. It is probably best to leave them as bytes. Then implement a trie (or a variant). Load all substrings into the trie. The node objects should contain members identifying which of the loaded array elements they belong to. Then search it with each substring and make your matches.
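One way to read the suggestion above is as a binary suffix trie: insert every suffix of every input string, and have each node remember which input strings pass through it. The following is only a sketch under some assumptions that are not in the question: that the string indexer and Length are allowed (only searching helpers are off limits), that the inputs really contain nothing but '0' and '1', and that returning the messages as a list (rather than the void signature in the question) is acceptable. The names SubstringFinder, Node and Owners are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SubstringFinder
{
    // One trie node: Children[0] follows '0', Children[1] follows '1'.
    private sealed class Node
    {
        public readonly Node[] Children = new Node[2];
        // Indices of the input strings that contain the path from the root
        // to this node, in increasing order.
        public readonly List<int> Owners = new List<int>();
    }

    public static List<string> IdentifySubstrings(string[] input)
    {
        var root = new Node();

        // Insert every suffix of every string. Each node visited on the way
        // represents a substring of input[owner], so it records that owner.
        for (int owner = 0; owner < input.Length; owner++)
        {
            string s = input[owner];
            for (int start = 0; start < s.Length; start++)
            {
                Node node = root;
                for (int i = start; i < s.Length; i++)
                {
                    int bit = s[i] - '0'; // assumes only '0' and '1' occur
                    if (node.Children[bit] == null)
                        node.Children[bit] = new Node();
                    node = node.Children[bit];

                    // Owners are processed in increasing order, so checking
                    // the last entry is enough to avoid duplicates.
                    int count = node.Owners.Count;
                    if (count == 0 || node.Owners[count - 1] != owner)
                        node.Owners.Add(owner);
                }
            }
        }

        // Walk each whole string from the root. The node where the walk ends
        // lists every input string that contains it.
        var results = new List<string>();
        for (int k = 0; k < input.Length; k++)
        {
            string s = input[k];
            Node node = root;
            // The walk cannot hit null: s itself was inserted as a suffix.
            for (int i = 0; i < s.Length; i++)
                node = node.Children[s[i] - '0'];

            foreach (int owner in node.Owners)
            {
                if (owner != k) // skip the trivial match with itself
                    results.Add(s + " is a substring of " + input[owner]);
            }
        }
        return results;
    }
}
```

Calling SubstringFinder.IdentifySubstrings(input) with the array from the question and printing each entry with Console.WriteLine should give the four lines listed there. Building the trie this way costs time and memory roughly quadratic in the length of each string, because every suffix is inserted character by character. A real suffix tree or suffix automaton would bring that down to linear, at the price of a considerably more involved construction.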