Using Python and regex I am trying to find words in a piece of text that start with a capital letter but are not at the start of a sentence.
The best way I can think of is to check that the word is not preceded by a full stop and then a space. I am pretty sure that I need to use a negative lookbehind. This is what I have so far; it runs but always returns nothing:
(?<!\.\s)\b[A-Z][a-z]*\b
I think the problem might be with the use of [A-Z][a-z]* inside the word boundary \b but I am really not sure.
Thanks for the help.
Your regex appears to work:
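Here is a minimal check. The sample sentence is made up for illustration (it is not from the question), and the names `pattern`, `text` and `matches` are just placeholders:

```python
import re

# The pattern from the question, written as a raw string.
pattern = re.compile(r'(?<!\.\s)\b[A-Z][a-z]*\b')

# Made-up sample text, purely for illustration.
text = "He met Alice in Paris. Then Bob called."

matches = pattern.findall(text)
print(matches)  # ['He', 'Alice', 'Paris', 'Bob']
```

"Then" is skipped because it follows a full stop and a space. Note that "He" is still matched: nothing precedes the very first word of the text, so the negative lookbehind succeeds there. If you want to exclude that word too, you would need to handle the start of the string separately.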
Make sure you’re using a raw string (r'...') when specifying the regex.
If you have some specific inputs on which the regex doesn’t work, please add them to your question.
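On the raw-string point: in an ordinary Python string literal, `\b` means a backspace character rather than a regex word boundary, which would explain a pattern that runs but never matches. This is only a guess at what happened in your code. A small sketch with made-up sample text and placeholder variable names:

```python
import re

# Made-up sample text, purely for illustration.
text = "He met Alice in Paris. Then Bob called."

# Raw string: \b reaches the regex engine as a word boundary.
raw = r'(?<!\.\s)\b[A-Z][a-z]*\b'

# Ordinary string: '\b' is turned into a backspace character (0x08)
# before the regex engine ever sees it. The other backslashes are
# doubled here only to avoid invalid-escape warnings.
not_raw = '(?<!\\.\\s)\b[A-Z][a-z]*\b'

raw_matches = re.findall(raw, text)
not_raw_matches = re.findall(not_raw, text)

print(raw_matches)      # ['He', 'Alice', 'Paris', 'Bob']
print(not_raw_matches)  # [] because the text contains no backspace characters
```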