I need to take a string and extract every instance of a pattern and only the pattern.
String test = "This is a test string to experiment with regex by separating every instance of the word test and words that trail test";
So now the pattern would have to find the word test as well as any words ahead and behind it that are not test. So basically it would have to result in 3 instances of this pattern being found.
The 3 results that I’m expecting are as follows:
This is a test string to experiment with regex by separating every instance of the word
test and words that trail
test
I’ve played around with positive and negative lookahead on gskinner, but no luck yet.
Try this:
(\s*\b(?!test\b)[a-z]+\b\s*)*test(\s*\b(?!test\b)[a-z]+\b\s*?)*
See it here on Regexr.
In Java, I would replace [a-z] with \p{L}, but Regexr does not support Unicode properties. \p{L} matches any Unicode code point with the property "letter", so it will match every letter in any language.
Explanation:
(\s*\b(?!test\b)[a-z]+\b\s*)* matches a series of words that are not “test”. This is ensured by the negative lookahead assertion (?!test\b).
test matches “test”.
At the end, the same again: match a series of words that are not “test”, this time with
(\s*\b(?!test\b)[a-z]+\b\s*?)*
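For Java, the pieces explained above could be put together roughly like this. It is only a minimal sketch: the class name TestSegments and the method extract are made up for illustration, and it assumes the [a-z] to \p{L} swap mentioned earlier, which also lets the capitalized "This" match without a case-insensitive flag.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and method are illustrative only.
class TestSegments {
    // Same three parts as in the explanation, with [a-z] replaced by \p{L}.
    private static final Pattern SEGMENT = Pattern.compile(
            "(\\s*\\b(?!test\\b)\\p{L}+\\b\\s*)*"       // leading words that are not "test"
            + "test"                                     // the word "test" itself
            + "(\\s*\\b(?!test\\b)\\p{L}+\\b\\s*?)*");   // trailing words that are not "test"

    // Collects every match of the pattern, in order of appearance.
    static List<String> extract(String input) {
        List<String> segments = new ArrayList<>();
        Matcher matcher = SEGMENT.matcher(input);
        while (matcher.find()) {
            segments.add(matcher.group());
        }
        return segments;
    }

    public static void main(String[] args) {
        String test = "This is a test string to experiment with regex by separating "
                + "every instance of the word test and words that trail test";
        // Brackets make leading or trailing whitespace visible.
        for (String segment : extract(test)) {
            System.out.println("[" + segment + "]");
        }
    }
}
```

On the sample string from the question this should give the three expected segments. One thing to keep in mind: because \s* appears at both ends of the repeated group, the pattern can backtrack heavily on long inputs that contain no "test" at all, so it is best suited to short strings like this one.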