I have a string like this:
inputString = "this is the first sentence in this book the first sentence is really the most interesting the first sentence is always first"
and a dictionary like this:
{
'always first': 0,
'book the': 0,
'first': 0,
'first sentence': 0,
'in this': 0,
'interesting the': 0,
'is always': 0,
'is really': 0,
'is the': 0,
'most interesting': 0,
'really the': 0,
'sentence in': 0,
'sentence is': 0,
'the first': 0,
'the first sentence': 0,
'the first sentence is': 0,
'the most': 0,
'this': 0,
'this book': 0,
'this is': 0
}
What is the most efficient way to update the frequency counts in this dictionary in a single pass over the input string, if that is possible at all? I have a feeling there must be a parsing technique for this, but I'm not an expert in this area, so I'm stuck. Any suggestions?
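Check out the Aho-Corasick algorithm. It builds an automaton from all the dictionary keys up front and then finds every occurrence of every key, including overlapping ones, in a single pass over the input.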
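Here is a minimal sketch of how that could look for your data. It is not a reference implementation: to respect word boundaries (so that 'is the' cannot match inside "this the"), I am assuming the automaton should run over whitespace-separated words rather than characters. The names build_automaton and count_phrases are made up for this example.

```python
from collections import deque

def build_automaton(phrases):
    # Node i: goto[i] maps a word to a child node, fail[i] is the failure
    # link, out[i] lists every phrase that ends when node i is reached.
    goto, fail, out = [{}], [0], [[]]
    for phrase in phrases:
        node = 0
        for word in phrase.split():
            if word not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][word] = len(goto) - 1
            node = goto[node][word]
        out[node].append(phrase)

    # Breadth-first pass: depth-1 nodes keep fail = 0; deeper nodes get the
    # longest proper suffix that is also a path from the root.
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for word, child in goto[node].items():
            f = fail[node]
            while f and word not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(word, 0)
            # Inherit the phrases that end at the suffix node as well.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out

def count_phrases(text, counts):
    # Updates counts in place; its keys are the phrases to look for.
    goto, fail, out = build_automaton(counts)
    node = 0
    for word in text.split():  # single pass over the input
        while node and word not in goto[node]:
            node = fail[node]
        node = goto[node].get(word, 0)
        for phrase in out[node]:
            counts[phrase] += 1
    return counts

inputString = ("this is the first sentence in this book the first sentence "
               "is really the most interesting the first sentence is always first")
counts = {'first': 0, 'the first': 0, 'the first sentence is': 0, 'this': 0}
count_phrases(inputString, counts)
print(counts)
# {'first': 4, 'the first': 3, 'the first sentence is': 2, 'this': 2}
```

The automaton is built once from the keys, and each input word is then handled in amortized constant time plus the number of matches reported at that position. If you need substring rather than whole-word matching, the same structure works with characters in place of words.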