I need a regex to match expressions that contain the string OKAY, then an optional hyphen, and then zero or one word characters. After this, any non-word character is accepted, followed by anything. For expressions that match, OKAY is changed to OK if no word character follows, and to e.g. OA if the following letter is A. If the hyphen exists, it is dropped.
OKAY => OK
OKAY- => OK
OKAYA => OA
OKAY-A => OA
OKAYAB => OKAYAB (no-match)
OKAY-AB => OKAY-AB (no-match)
The examples may be followed by e.g. .CD without changing the results:
OKAY.CD => OK.CD
OKAY-.CD => OK.CD
OKAYA.CD => OA.CD
OKAY-A.CD => OA.CD
OKAYAB.CD => OKAYAB.CD (no-match)
OKAY-AB.CD => OKAY-AB.CD (no-match)
My problem implementing this was that, since both the hyphen and the word character are optional, I get “lazy” matches that also match the unwanted cases.
For the sake of education, I would appreciate examples both with and without look-aheads (if possible).
Here is a regex that should work for you:
Since it isn’t clear what language you are using, here is pseudo code for how you would do the replacement.
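As a runnable take on that replacement logic, here is a minimal Python sketch. It assumes the pattern `\bOKAY(?>-?)(\w)?([^\w\s]\S*)?(?!\S)`, which is pieced together from the description in this answer rather than quoted from it. The names `PATTERN`, `_replace`, `shorten_okay` and the group names are made up for illustration. Python's `re` only gained atomic groups in 3.11, so the sketch stands in for `(?>-?)` with the lookahead-plus-backreference idiom, which should behave the same way here.

```python
import re

# Assumed pattern, rebuilt from the answer's description:
#   \bOKAY(?>-?)(\w)?([^\w\s]\S*)?(?!\S)
# (?=(?P<hyphen>-?))(?P=hyphen) emulates the atomic group (?>-?):
# a lookahead is not re-entered on backtracking, so once the hyphen
# has been taken it cannot be given back.
PATTERN = re.compile(
    r"\bOKAY"
    r"(?=(?P<hyphen>-?))(?P=hyphen)"  # optional hyphen, no backtracking
    r"(?P<ch>\w)?"                    # zero or one word character
    r"(?P<rest>[^\w\s]\S*)?"          # a non-word, non-space char, then the rest of the token
    r"(?!\S)"                         # the token must end here
)

def _replace(m):
    ch = m.group("ch") or ""
    rest = m.group("rest") or ""
    # "OKAY" + letter -> "O" + letter; bare "OKAY" -> "OK"; the hyphen is dropped
    return ("O" + ch if ch else "OK") + rest

def shorten_okay(text):
    return PATTERN.sub(_replace, text)

print(shorten_okay("OKAY OKAY-A OKAYAB OKAY-.CD"))  # OK OA OKAYAB OK.CD
```

The callback form of `re.sub` is used because the replacement depends on whether the word-character group took part in the match, which a plain replacement string cannot express.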
Here is a rubular: http://www.rubular.com/r/SE8MBkUUUo
edit: I made some changes in the above regex after the comments, but the description below does not reflect those changes. Here are the changes from the original regex:
- `^` to `\b`, so it doesn’t need to start at the beginning of a line
- `\W` became `[^\w\s]`, which prevents `OKAY OKAY` from being one match
- `.*` to `\S*`, so the match will end at whitespace
- `$` to `(?!\S)`; `(?!\S)` means “only match if we are at the end of the string or the next character is whitespace”, and could also be written as `(?=\s|\z)`

The really tricky part here is that a regex like `^OKAY-?(\w)?(\W.*)?$` looks like it would work, but it does not for a case like `OKAY-AB`. After backtracking, both the `-?` and the `(\w)?` end up matching nothing, and then `(\W.*)?` matches the remainder of the string, hyphen included.

What we need to do to fix this is make it so `-?` will not backtrack. This would be simple if possessive quantifiers were supported by .NET; then we could just change it to `-?+`. Unfortunately they aren’t supported, so we need to use atomic grouping instead.

`(?>-?)` will optionally match a `-`, but will forget all backtracking information as soon as it exits the group. Note that the atomic group does not capture, so `(\w)?` is capture group 1.
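To see the backtracking problem and the fix side by side, here is a small Python sketch using the line-anchored patterns discussed above. The names `NAIVE`, `ATOMIC` and `groups_for` are hypothetical. Because atomic groups are only available in Python's `re` from 3.11 on, `ATOMIC` emulates `(?>-?)` with a lookahead and a backreference. That emulation adds a capture group for the hyphen, so here the word character lands in group 2 rather than group 1.

```python
import re

# The tempting pattern: every optional piece is free to backtrack.
NAIVE = re.compile(r"^OKAY-?(\w)?(\W.*)?$")

# Emulation of ^OKAY(?>-?)(\w)?(\W.*)?$ for Python versions before 3.11.
# Group 1 = hyphen (from the lookahead), group 2 = word char, group 3 = rest.
ATOMIC = re.compile(r"^OKAY(?=(-?))\1(\w)?(\W.*)?$")

def groups_for(pattern, text):
    """Return the captured groups, or None when the pattern does not match."""
    m = pattern.match(text)
    return None if m is None else m.groups()

# The naive pattern gives the hyphen back and lets (\W.*)? swallow "-AB".
print(groups_for(NAIVE, "OKAY-AB"))     # (None, '-AB')

# With the hyphen locked in, there is no way to reach the end of the string.
print(groups_for(ATOMIC, "OKAY-AB"))    # None

# Wanted cases still match.
print(groups_for(ATOMIC, "OKAY-A.CD"))  # ('-', 'A', '.CD')
```

On Python 3.11 or later, the same pattern can presumably be written directly with `(?>-?)` or the possessive `-?+`, which restores the group numbering described above.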