I have some Python code, shown below, that searches for all the English names in a string:
import re
a = 'Bonds met Susann ("Sun") Margreth Branco, the mother of his first two children, in {{city-state|Montreal|Quebec}} in August 1987. They eloped in {{city-state|Las Vegas|Nevada}} Barry Bonds'
re.findall(r"(?:[A-Z][a-z'.]+\s*){1,4}", a)
I want it to return:
['Bonds', 'Susann ("Sun") Margreth Branco', 'Montreal', 'Quebec', 'August', 'They', 'Las Vegas', 'Nevada', 'Barry Bonds']
My code does not return what I want. How can I modify the regex to achieve this?
I should add that I also tried another regex, (?:(([A-Z][a-z'.]+)|(\(".*"\)))\s*){1,4}. When I tested it on regexpal.com it found what I want, but in Python it does not: it returns Susann, ("Sun") Margreth, and Branco as three separate results, whereas I want Susann ("Sun") Margreth Branco as a single result.
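One likely reason for the difference between the online tester and Python: when a pattern contains capturing groups, re.findall returns the contents of those groups rather than the whole match. The sketch below is one possible way around that, not a definitive solution. It uses only non-capturing groups, and it assumes that a name always starts with a capitalized word and that a quoted nickname in parentheses may appear only after the first word. The names WORD, NICK and NAME_RE are my own, chosen for illustration.

```python
import re

a = ('Bonds met Susann ("Sun") Margreth Branco, the mother of his first two '
     'children, in {{city-state|Montreal|Quebec}} in August 1987. They eloped '
     'in {{city-state|Las Vegas|Nevada}} Barry Bonds')

# A capitalized word, e.g. Bonds or Las (hypothetical helper name).
WORD = r"[A-Z][a-z'.]+"
# A double-quoted nickname in parentheses, e.g. ("Sun") (hypothetical helper name).
NICK = r'\("[^"]*"\)'

# One capitalized word followed by up to three more words or nicknames,
# keeping the 1-4 token limit from the original regex.
# Every group is non-capturing (?:...), so findall returns whole matches.
NAME_RE = re.compile(rf"{WORD}(?:\s+(?:{WORD}|{NICK})){{0,3}}")

names = NAME_RE.findall(a)
print(names)
```

If the capturing groups are needed for something else, re.finditer with m.group(0) on each match also gives the full matched text.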
As you mentioned, the “&quot;” string looks like it acts as a delimiter as well:
Output: