How to construct a regular expression search pattern to find string1 that is not followed by string2 (immediately or not)?
For instance, if string1="MAN" and string2="PN", example search results would be:
"M": Not found
"MA": Not found
"MAN": Found
"BLAH_MAN_BLEH": Found
"MAN_PN": Not found
"BLAH_MAN_BLEH_PN": Not found
"BLAH_MAN_BLEH_PN_MAN": Not found
Ideally, a one-liner search, instead of doing a second search for string2.
PS: Language being used is Python
It looks like you can use `MAN(?!.*PN)`. This matches `MAN` and uses a negative lookahead to make sure that it's not followed by `PN` (as seen on rubular.com).
Given `MAN_PN_MAN_BLEH`, the above pattern will find the second `MAN`, since it's not followed by `PN`. If you want to validate the entire string and make sure that there's no `MAN.*PN`, then you can use something like `^(?!.*MAN.*PN).*MAN.*$` (as seen on rubular.com).
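As a rough sketch of how the two patterns behave in Python's `re` module: the helper names `find_unfollowed` and `never_followed` are made up for illustration, and `re.escape` is added on the assumption that both strings should be matched literally. It also assumes single-line input, since `.` does not match newlines by default.

```python
import re

def find_unfollowed(text, string1="MAN", string2="PN"):
    """Return the first match of string1 with no string2 after it, else None."""
    # Equivalent to MAN(?!.*PN) for the default arguments.
    pattern = rf"{re.escape(string1)}(?!.*{re.escape(string2)})"
    return re.search(pattern, text)

def never_followed(text, string1="MAN", string2="PN"):
    """True if string1 occurs and no occurrence of it is followed by string2."""
    s1, s2 = re.escape(string1), re.escape(string2)
    # Equivalent to ^(?!.*MAN.*PN).*MAN.*$ for the default arguments.
    return re.search(rf"^(?!.*{s1}.*{s2}).*{s1}.*$", text) is not None

samples = ["M", "MA", "MAN", "BLAH_MAN_BLEH", "MAN_PN",
           "BLAH_MAN_BLEH_PN", "BLAH_MAN_BLEH_PN_MAN"]
for s in samples:
    print(f"{s!r}: {'Found' if never_followed(s) else 'Not found'}")
```

Note that for "BLAH_MAN_BLEH_PN_MAN" the first pattern still finds the trailing `MAN`, so only the whole-string pattern reproduces the "Not found" result from the question.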
Non-regex option
If the strings are to be matched literally, then you can also check for indices of substring occurrences.
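In Python, `find` and `rfind` return the lowest and highest index of substring occurrences respectively, and both return `-1` if the substring is not found.
So to make sure that `string1` occurs but is never followed by `string2`, it looks like you can just test for this condition, where `s` is the string being searched: `s.find(string1) > s.rfind(string2)`
This compares the leftmost occurrence of `string1` and the rightmost occurrence of `string2`.
If neither occurs, both are `-1`, and the result is `False`.
If `string1` occurs, but `string2` doesn't, then the result is `True` as expected.
If both occur, then for the result to be `True` the rightmost `string2` must be to the left of the leftmost `string1`. That is, no `string1` is ever followed by `string2`.
API links: `str.find`, `str.rfind`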
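The index comparison described above can be sketched as a small function. The name `occurs_unfollowed` is hypothetical, and the sketch assumes both strings are non-empty literals (an empty `string2` makes `rfind` return `len(text)`, which would skew the comparison).

```python
def occurs_unfollowed(text: str, string1: str, string2: str) -> bool:
    """True if string1 occurs in text and is never followed by string2."""
    # Leftmost string1 must lie to the right of the rightmost string2.
    # Both calls return -1 when the substring is absent.
    return text.find(string1) > text.rfind(string2)

samples = ["M", "MA", "MAN", "BLAH_MAN_BLEH", "MAN_PN",
           "BLAH_MAN_BLEH_PN", "BLAH_MAN_BLEH_PN_MAN"]
for s in samples:
    print(f"{s!r}: {'Found' if occurs_unfollowed(s, 'MAN', 'PN') else 'Not found'}")
```

This avoids regex escaping altogether, at the cost of two linear scans of the string.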