I’m trying to write a program in Java that looks for patterns of strings inside a text file.
Consider the following text, taken from a novel:
She was a very awesome woman, he thought. Then she said: “Hello, my name’s Lauren. What’s yours?”
I’d like to find every occurrence of this sequence of words: HELLO, then any string, then a NAME (taken from a list). For the example above, I would get the part shown in bold:
She was a very awesome woman, he thought. Then she said: “**Hello, my name’s Lauren**. What’s yours?”
At first I thought about using regex, then I considered writing a parser (maybe one generated by JFlex or ANTLR).
Does anyone know of an easier, and hopefully quicker to code, solution?
You can try the Stanford POS Tagger to tag the parts of speech in each sentence, and then fetch the sentences that meet the criteria you are looking for.
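If the names really do come from a fixed list, the regex idea mentioned in the question may already be enough before reaching for a tagger. The following is only a minimal sketch of that alternative, not of the POS tagger approach: the class name `GreetingFinder`, the helper names, and the rule that the gap between "hello" and the name must stay inside one sentence are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper class, not part of any library.
class GreetingFinder {

    /** Builds a pattern: "hello", a gap within one sentence, then one of the names. */
    static Pattern buildPattern(List<String> names) {
        if (names.isEmpty()) {
            throw new IllegalArgumentException("names must not be empty");
        }
        // Quote each name so characters such as '.' are matched literally.
        String alternation = names.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        // [^.!?]*? is a reluctant gap that cannot cross a sentence-ending mark
        // (an assumption; loosen it if matches may span sentences).
        return Pattern.compile(
                "\\bhello\\b[^.!?]*?\\b(?:" + alternation + ")\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
    }

    /** Returns every matched span of text, in order of appearance. */
    static List<String> findAll(String text, List<String> names) {
        List<String> hits = new ArrayList<>();
        Matcher m = buildPattern(names).matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // \u201C, \u201D and \u2019 are the curly quotes and apostrophe from the novel text.
        String text = "She was a very awesome woman, he thought. Then she said: "
                + "\u201CHello, my name\u2019s Lauren. What\u2019s yours?\u201D";
        // Prints: [Hello, my name’s Lauren]
        System.out.println(findAll(text, List.of("Lauren", "Emma")));
    }
}
```

This only works when the name list is known in advance and the greeting word is fixed. If you need to recognise names that are not in a list, or looser sentence structure, tagging the text first is the more robust route.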