I’m struggling to write a Perl-compatible regex that is reasonably smart about distinguishing strings that refer to the Republic of the Congo from strings that refer to the Democratic Republic of the Congo. I’ll be using this expression in a program with R’s grepl function, which returns TRUE if the regex matches the string and FALSE otherwise.
The country I want to identify can be written in several different orders and forms. For example:
republic of congo
republic of the congo
congo, republic of the
congo, republic
The country I do not want to match has similar patterns:
democratic republic of the congo
congo, democratic republic of the
dem rep of the congo
What I’m looking for, I guess, is a regex that would match on rep and congo, but would fail any time there’s a “dem” in the string.
Any ideas? Thanks!
This matches your first set of sample strings and ignores the second.
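For illustration, here is a minimal sketch of one pattern with that behaviour. It uses lookaheads to require a word starting with "rep" and the word "congo", and a negative lookahead to reject any word starting with "dem". It is written in Python only because its `re` module accepts the same Perl-style lookahead syntax. The pattern and the names `CONGO_REPUBLIC` and `is_republic_of_congo` are assumptions made for this example, not necessarily the exact expression intended above.

```python
import re

# Illustrative pattern (an assumption, not quoted from the thread):
#   (?!.*\bdem)      reject the string if any word starts with "dem"
#   (?=.*\brep)      require some word starting with "rep"
#   (?=.*\bcongo\b)  require the whole word "congo"
CONGO_REPUBLIC = re.compile(
    r"^(?!.*\bdem)(?=.*\brep)(?=.*\bcongo\b)",
    re.IGNORECASE,
)


def is_republic_of_congo(name: str) -> bool:
    """Return True if name has a rep... word and "congo" but no dem... word."""
    return CONGO_REPUBLIC.search(name) is not None


# Sample strings taken from the question.
wanted = [
    "republic of congo",
    "republic of the congo",
    "congo, republic of the",
    "congo, republic",
]
unwanted = [
    "democratic republic of the congo",
    "congo, democratic republic of the",
    "dem rep of the congo",
]

for name in wanted + unwanted:
    print(f"{name!r}: {is_republic_of_congo(name)}")
```

The same pattern string should also work in R with `grepl(pattern, x, perl = TRUE, ignore.case = TRUE)`, provided each backslash is doubled inside the R string literal. Note that a bare "congo" with no "rep" word is rejected as well, which may or may not be what you want.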
In Perl this becomes