I’m building this regex with a positive lookahead in it. Basically it must select all text in the line up to the last period that precedes a “:” and add a “|” to the end to delimit it. Some sample text is below. I am testing this in gskinner and EditPad Pro, which apparently has full grep regex support, so if I could get the answers in that form I’d appreciate it.
The regex below works to a degree but I am unsure if it is correct. Also it falls down if the text contains brackets.
Finally, I would like to add another ignore rule like the one that ignores but includes “Co.” in the selection. This second rule would ignore, but still include, periods that have a single capital letter before them. Sample text is below too. Thanks for all the help.
^(?:[^|]+\|){3}(.*?)[^(?:Co)]\.(?=[^:]*?\:)
121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631.
122| O' Toole, H.Y. |2004. |(Note on the regex). Pages 90-91 In: Ryan, A. & Toole, B.L. (Editors) Guide to the regex functionality in php. Timmy, Tommy& Stewie, Quohog. * Produced for Family Guy in Quohog.
I don’t think I understand what you want to do. But this part
[^(?:Co)]
is definitely not correct. With the square brackets you are creating a character class, and because of the ^ it is a negated class. That means that at this position you don’t want to match any one of the characters (?:Co). In other words, it will match any character other than “?)(:Co”.
Update:
I don’t think it’s possible. How should I distinguish between “L. Co.” or something similar and the end of a sentence?
But I found another error in your regex. The last part
(?=[^:]*?\:)
should be
(?=[^.]*?\:)
if you want to match the last dot before the “:”. With your expression it will match on the first dot. See it here on Regexr
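If you want to check both points outside Regexr, here is a rough sketch using Python's re module. Python is my choice here, not the engine from the question, so treat it as an illustration only. The helper names class_accepts and text_up_to_first_hit are made up, and the lookahead patterns are cut down to just the dot plus the lookahead rather than the full expression from the question.

```python
import re
from typing import Optional

# Sample line 121 from the question.
LINE = "121| Ryan, T.N. |2001. |I like regex. But does it like me (2) 2: 615-631."

# Inside [...] the characters ( ? : ) are literals, so this is a negated
# character class over the six characters "(?:Co)", not a negated group.
NEGATED_CLASS = re.compile(r"[^(?:Co)]")


def class_accepts(ch: str) -> bool:
    """Hypothetical helper: True if the single character ch matches the class."""
    return NEGATED_CLASS.fullmatch(ch) is not None


def text_up_to_first_hit(pattern: str, text: str) -> Optional[str]:
    """Hypothetical helper: the text from the start up to and including the first match."""
    m = re.search(pattern, text)
    return text[: m.end()] if m else None


# Every character listed in the class is rejected; anything else is accepted.
print([class_accepts(c) for c in "(?:Co)"])
print(class_accepts("x"))

# Original lookahead: any dot that has a colon somewhere after it qualifies,
# so the search stops at the very first dot in the line.
print(text_up_to_first_hit(r"\.(?=[^:]*:)", LINE))

# Corrected lookahead: no further dot may sit between the match and the colon,
# so only the last dot before the colon qualifies.
print(text_up_to_first_hit(r"\.(?=[^.]*:)", LINE))
```

The first pattern stops right after “T.” in the author name, while the second one runs up to “I like regex.”, which is the last dot before the colon.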