I’m trying to get pyparsing to extract a sub-string consisting of a variable number of words from a string.
The following almost works but loses the last word of the sub-string:
from pyparsing import Literal, OneOrMore, Word, alphas
text = "Joe F Bloggs is the author of this book."
author = OneOrMore(Word(alphas) + ~Literal("is the"))
print(author.parseString(text))
Output:
['Joe', 'F']
What am I missing?
PS: I know I can do this with a regular expression but specifically want to do it with pyparsing because it needs to fit into a large effort already written using pyparsing.
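Your negative lookahead has to come before the actual author word. As written, each repetition matches a word and then checks that "is the" does not follow it, so when the parser reaches "Bloggs" the lookahead fails and the whole repetition, including "Bloggs", is rejected. Put the lookahead first, so it guards the start of each word instead: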
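A minimal sketch of the reordered grammar, reusing the `text` and `author` names from the question. The `result` variable and the `asList()` call are my additions for readability, and `print()` is written as a function call so the snippet also runs on Python 3:

```python
from pyparsing import Literal, OneOrMore, Word, alphas

text = "Joe F Bloggs is the author of this book."

# Check the lookahead first: a word is consumed only if "is the"
# does not begin at the current position.
author = OneOrMore(~Literal("is the") + Word(alphas))

result = author.parseString(text)
print(result.asList())  # ['Joe', 'F', 'Bloggs']
```

The repetition now stops cleanly in front of "is the" without discarding the word that precedes it.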