Note: I’m using a third-party app that uses regex for searches. It has its own flavor, but it almost always works like Java’s flavor of regex. Of course, this may not matter.
After searching for this same question phrased many different ways, I did not find any tutorials, examples, or even mentions of whether it is possible to use both an “is” (positive?) and an “is not” (negative?) definition within the same range.
I can’t test the example in the app right now to see if my ideas work, because the amount of data being searched is massive and a test would screw up the matches it has already gathered. That is the only reason I’m asking.
Here are examples of what I thought might work, but which caused the testers to act weird:
[\w^\s<>.!?]{2}
[\w|^\s<>.!?]{2}
I would rather have it work the way I think the first one would work (any digit, lower case, or upper case character, or other normal character that is not a space, >, <, period, !, or ?) rather than the second, which only has an or operator.
The regex testers I used gave me different funky results which is what is confusing me.
Also note: I’m using this within a capture group, which is followed by a catch-everything match that I may or may not be using properly. So if you’d like to include how to properly follow what I’m attempting with that, feel free. I AM MAINLY JUST CURIOUS AS TO WHETHER THIS WAS POSSIBLE OR NOT, OR IF IT WAS AN IMPROPER METHOD.
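Why do you need the \w at all?
[^\s<>.!?]
This already matches all alphanumeric characters, since they are neither whitespace nor any of the punctuation characters you mentioned. Note that inside a character class, ^ only negates when it is the very first character; anywhere else it is a literal caret, and | is a literal pipe rather than an or operator. That is why both of your attempts behaved strangely.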
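To see the difference without touching the real data, here is a minimal sketch in Java, assuming the app's flavor behaves like java.util.regex here (which may not hold). The class name CaretInClassDemo and the helper fullMatch are made up for illustration. It compares the first pattern from the question with the fully negated class.

```java
import java.util.regex.Pattern;

public class CaretInClassDemo {
    // ^ is not the first character, so it is a literal caret and every
    // listed character (including . ! ? < > and whitespace) is allowed.
    static final String MIXED = "[\\w^\\s<>.!?]{2}";
    // ^ is the first character, so the whole class is negated.
    static final String NEGATED = "[^\\s<>.!?]{2}";

    // Hypothetical helper: true if the entire input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        for (String input : new String[] {"ab", "a.", "a ", "^^", "a-"}) {
            System.out.println("'" + input + "' mixed=" + fullMatch(MIXED, input)
                    + " negated=" + fullMatch(NEGATED, input));
        }
    }
}
```

The mixed class accepts "a." and "a " because the period and whitespace are simply members of the class, while the negated class rejects them and still accepts ordinary characters such as "a-".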
In general, you can subtract character classes to some degree. For example, to match alphanumerics (plus the underscore, which \w also covers) excluding digits, you can do
[^\W\d]
because [^\W] matches the same as \w, and \d is subtracted from that because it’s in a negated character class.
Edit:
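Some regex engines (like XPath, .NET and JGSoft) allow flexible character class subtraction like this:
[a-z-[efg]]
to match any character from the range [a-z], excluding e, f and g. Java does not support this syntax, but it does offer character class intersection with &&, which gives the same result:
[a-z&&[^efg]]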
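For the Java flavor specifically, both kinds of subtraction can be checked with a short sketch like the following. The class name ClassSubtractionDemo and the helper fullMatch are hypothetical, and the third-party app's flavor may not support the && intersection even if java.util.regex does.

```java
import java.util.regex.Pattern;

public class ClassSubtractionDemo {
    // Word characters minus digits: "not (non-word or digit)".
    static final Pattern WORD_NO_DIGIT = Pattern.compile("[^\\W\\d]+");
    // a-z minus e, f, g, written as an intersection with a negated class.
    static final Pattern A_TO_Z_NO_EFG = Pattern.compile("[a-z&&[^efg]]+");

    // Hypothetical helper: true if the entire input matches the pattern.
    static boolean fullMatch(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch(WORD_NO_DIGIT, "abc_XYZ")); // letters and underscore only
        System.out.println(fullMatch(WORD_NO_DIGIT, "abc1"));    // contains a digit
        System.out.println(fullMatch(A_TO_Z_NO_EFG, "abcd"));    // none of e, f, g
        System.out.println(fullMatch(A_TO_Z_NO_EFG, "abcde"));   // contains 'e'
    }
}
```

The negated-class form works in most flavors, so it is the safer choice when the exact engine is unknown; the && form is easier to read when subtracting from an explicit range.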