I am looking to find and replace text that is NOT inside of a tag.
The tags are simple, and follow this pattern:
(<[A-Z_]+>)([^<]+)(</[A-Z_]+>)
- Group 1: Opening tag
- Group 2: Tag content
- Group 3: Closing tag
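To make the three groups concrete, here is a small Python sketch that runs the pattern above over one of the sample lines. The helper name `split_tags` is made up for illustration; only the regular expression comes from the pattern itself.

```python
import re

# The tag pattern from above, unchanged.
TAG = re.compile(r"(<[A-Z_]+>)([^<]+)(</[A-Z_]+>)")

def split_tags(line):
    """Return one (opening tag, content, closing tag) tuple per tagged span."""
    return [m.groups() for m in TAG.finditer(line)]

line = "Why Don't You watch <NAME>This is Spinal Tap</NAME> on <DAY>Friday</DAY> or whenever?"
for opening, content, closing in split_tags(line):
    print(opening, "|", content, "|", closing)
```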
The text I need to look at is like this (1 line per text item with possibly multiple tags in each line):
What movie did The programmer watch if he didn't watch <NAME>The Office</NAME>?
Here I need to match “The” before “programmer”, but NOT the “The” inside the <NAME> tag. I’m looking to change the line to:
What movie did the programmer watch if he didn't watch <NAME>The Office</NAME>?
As another example, perhaps better:
What movie did The programmer watch if he didn't watch <NAME>Dawn of the Dead - The Original Director's Cut</NAME>?
Basically, I’m looking to fix case problems with the text that is outside of a tag. I do not want to touch any text inside of the tag at all. Here’s another example:
Why Don't You watch <NAME>This is Spinal Tap</NAME> on <DAY>Friday</DAY> or whenever?
There, I’d like to find:
- Don’t
- You
But not “Spinal”, etc., and end up with this:
Why don't you watch <NAME>This is Spinal Tap</NAME> on <DAY>Friday</DAY> or whenever?
All the strings I need to process are sentences, and tagged text can appear anywhere inside them. No tags overlap or contain another tag.
Any help at all is appreciated. Even just a link or pointer to the right path to run down.
Thanks in advance!
Your solution is:
Check this demo.
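For readers who cannot open the demo, one common way to approach this kind of problem is sketched below in Python. It is not necessarily what the demo does. The idea is to match either a whole tagged span or a plain word, and to rewrite only the plain words. The names `TOKEN`, `replace_outside_tags`, and the `replacements` mapping are hypothetical, and the sketch assumes the tags are well formed and never nested, as the question states.

```python
import re

# Alternation: a whole tagged span (group 1) OR a word outside any tag (group 2).
# The tag alternative is tried first at every position, so the text inside a
# tag is consumed as a single unit and never reaches the word alternative.
TOKEN = re.compile(r"(<[A-Z_]+>[^<]+</[A-Z_]+>)|([A-Za-z']+)")

def replace_outside_tags(line, replacements):
    """Apply whole-word replacements to text that is not inside a tag."""
    def repl(match):
        if match.group(1) is not None:
            return match.group(1)  # tagged span: return it untouched
        word = match.group(2)
        # Assumption: replacements is an exact-match, case-sensitive mapping.
        return replacements.get(word, word)
    return TOKEN.sub(repl, line)

fixes = {"The": "the", "Don't": "don't", "You": "you"}
print(replace_outside_tags(
    "Why Don't You watch <NAME>This is Spinal Tap</NAME> on <DAY>Friday</DAY> or whenever?",
    fixes,
))
```

Note that a mapping like this also lowercases a matching word at the very start of a line, so a real case-fixing rule would probably need to treat the first word separately.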