For a class assignment, I must implement a class that searches for a pattern in a stream of chars that it receives in chronological order. Each character the class receives has a particular source (a planet, identified by an int ID).
We have to implement the data structure ourselves, so I implemented a String List in which I store all these characters in chronological order.
The problem is that the pattern must be matched against characters coming from the same planet (source), so pattern matching must be done separately for each source.
I tried well-known pattern matching algorithms such as Rabin-Karp by traversing the whole list while only taking into account the source currently being examined, and then repeating this for every source, but the performance is really poor, even worse than a naive (but synchronous) solution.
Do you have any idea which algorithm would be more efficient in this case? (One that lets me use each character as I traverse it, even if this implies storing the current “search state” of each source somewhere, like we did for the naive implementation.)
P.S.: The IDs are bounded (from 1 to 128), but the number of chars can go up to 10⁷.
EDIT: Here are some details that will hopefully clarify things.
IntlFinder, my class, can receive characters (or arrays of characters) through a method Add(char* pszData, int nSource); hence, each character is coupled with a source ID. Each pair (character, source) is stored in a StringList ComList (in chronological order of addition).
For the pattern to be present in my class, it must be present for THE SAME SOURCE.
Example:
If I’m looking for the pattern SAYKOUK
(S, 1); (A, 1); (Y, 1); (K, 1); (Z, 2); (S, 3); (O, 1); (U, 1); (K, 1)
is OK!
(S, 1); (A, 1); (Y, 1); (K, 2); (O, 3); (U, 1); (K, 4)
is not OK.
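To make the two examples concrete, here is a minimal sketch of the "one search state per source" idea. It assumes KMP (Knuth-Morris-Pratt) rather than Rabin-Karp, and the names PerSourceMatcher, feed and containsPerSource are hypothetical, not part of IntlFinder. It uses the standard library only for brevity; the assignment would need hand-written containers.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of IntlFinder): consumes (character, source)
// pairs in chronological order and keeps one KMP state per source ID (1..128).
class PerSourceMatcher {
public:
    explicit PerSourceMatcher(const std::string& pattern)
        : pattern_(pattern), failure_(pattern.size(), 0) {
        state_.fill(0);
        // Standard KMP failure table: failure_[i] is the length of the longest
        // proper prefix of pattern_[0..i] that is also a suffix of it.
        for (std::size_t i = 1, k = 0; i < pattern_.size(); ++i) {
            while (k > 0 && pattern_[i] != pattern_[k]) k = failure_[k - 1];
            if (pattern_[i] == pattern_[k]) ++k;
            failure_[i] = k;
        }
    }

    // Feeds one character from one source. Returns true when the pattern has
    // just been completed using only characters of that source.
    bool feed(char c, int source) {
        if (pattern_.empty() || source < 1 || source > kMaxSources) return false;
        std::size_t& k = state_[source - 1];  // search state of this source only
        while (k > 0 && c != pattern_[k]) k = failure_[k - 1];
        if (c == pattern_[k]) ++k;
        if (k == pattern_.size()) {
            k = failure_[k - 1];  // keep going so overlapping matches are found
            return true;
        }
        return false;
    }

private:
    static constexpr int kMaxSources = 128;  // IDs range from 1 to 128
    std::string pattern_;
    std::vector<std::size_t> failure_;
    std::array<std::size_t, kMaxSources> state_;
};

// Single pass over the whole stream, whatever the number of sources.
bool containsPerSource(const std::string& pattern,
                       const std::vector<std::pair<char, int>>& stream) {
    PerSourceMatcher matcher(pattern);
    for (const auto& item : stream) {
        if (matcher.feed(item.first, item.second)) return true;
    }
    return false;
}
```

Under these assumptions every character is examined once, in arrival order, and the extra memory is one integer per source plus the failure table of the pattern.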
This is problematic because if I consider only one source at a time (out of the 128 possible) and traverse the whole list each time, my pattern search is REALLY slow. And with none of these algorithms can I manage to take the characters of the different sources into account in a single pass and know when the pattern has been met for any of them!
I ended up using a linked list with the classical “next” and “previous” pointers, but also “nextSource” and “previousSource” pointers that point to the next and previous characters of the same source. That way, I was able to use classical pattern-matching algorithms on each source without visiting the characters of the other sources.
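A rough sketch of such a list, for illustration only: the four pointer names come from the description above, while SourceThreadedList, add, charsOf and the per-source head/tail arrays are assumptions of mine, and the real ComList may look different.

```cpp
#include <string>

// One stored character, linked both chronologically and per source.
struct Node {
    char ch;
    int source;
    Node* next;            // next character in chronological order
    Node* previous;        // previous character in chronological order
    Node* nextSource;      // next character from the same source
    Node* previousSource;  // previous character from the same source
};

// Hypothetical container name; source IDs are assumed to be 1..128.
class SourceThreadedList {
public:
    static const int kMaxSources = 128;

    SourceThreadedList() : head_(nullptr), tail_(nullptr) {
        for (int i = 0; i < kMaxSources; ++i) firstOf_[i] = lastOf_[i] = nullptr;
    }
    ~SourceThreadedList() {
        while (head_) { Node* n = head_->next; delete head_; head_ = n; }
    }
    SourceThreadedList(const SourceThreadedList&) = delete;
    SourceThreadedList& operator=(const SourceThreadedList&) = delete;

    // Appends in O(1): the node is linked at the end of the chronological
    // chain and at the end of the chain of its own source.
    void add(char c, int source) {
        if (source < 1 || source > kMaxSources) return;
        Node* n = new Node{c, source, nullptr, tail_, nullptr, lastOf_[source - 1]};
        if (tail_) tail_->next = n; else head_ = n;
        tail_ = n;
        if (n->previousSource) n->previousSource->nextSource = n;
        else firstOf_[source - 1] = n;
        lastOf_[source - 1] = n;
    }

    // Walks only the nodes of one source, skipping all other characters.
    std::string charsOf(int source) const {
        std::string out;
        if (source < 1 || source > kMaxSources) return out;
        for (const Node* n = firstOf_[source - 1]; n; n = n->nextSource) out += n->ch;
        return out;
    }

private:
    Node* head_;
    Node* tail_;
    Node* firstOf_[kMaxSources];  // first node of each source
    Node* lastOf_[kMaxSources];   // last node of each source
};
```

Following nextSource from firstOf_[id - 1] yields the characters of a single source in order, so any classical matcher can run on that chain as if it were an ordinary string.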