So basically I have this giant regular expression pattern, and somewhere in the middle of it is the expression (?:\s(\d\d\d)|(\d\d\d\d)). At this point in the parse I want to capture either 3 digits that follow a space, or 4 digits, but I don’t want the capture that comes from putting parentheses around the whole thing (doesn’t ?: make a group non-capturing?). I have to use parentheses so that the “or” logic works (I think).
So potential example inputs would be something like…
- input1= giantexpression 123more characters after
- input2= giantexpression1234blahblahblah
I tried (?:\s(\d\d\d)|(\d\d\d\d)) and it gave an extra capture, at least in the case where I have 4 digits. So am I doing this right, or have I messed up somewhere?
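A minimal sketch of where the "extra" capture probably comes from. It is written in Python rather than the language of the original pattern, on the assumption that `(?:...)`, `|` and `(...)` behave the same way in both flavors. The helper name `show_groups` is made up for illustration. The `(?:...)` wrapper adds no group of its own; the pattern simply contains two capturing groups, and whichever branch did not match leaves its group empty.

```python
import re

# The sub-pattern from the question, unchanged: one non-capturing wrapper
# around two alternatives, each with its own capturing group.
PATTERN = re.compile(r"(?:\s(\d\d\d)|(\d\d\d\d))")


def show_groups(text):
    """Return the capture groups of the first match, or None if no match."""
    match = PATTERN.search(text)
    return match.groups() if match else None


# Group 1 belongs to the "space + 3 digits" branch,
# group 2 to the "4 digits" branch.
print(show_groups("giantexpression 123more characters after"))  # ('123', None)
print(show_groups("giantexpression1234blahblahblah"))           # (None, '1234')
```

So in the 4-digit case the digits land in the second group and the first group is left unmatched, which can look like an extra capture when you iterate over all groups.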
Edit:
To go into more detail… here’s the current regular expression I’m working with.
pattern = @".?(\d{1,2})\s*(\w{2}).?.?.?(?:\s(\d\d\d)|(\d\d\d\d)).*"
There’s a bit of parsing I have to do at the beginning. I think Sean Johnson’s answer would still work because I wouldn’t need to use “or”. But is there a way to do it in which you DO use “or”? I think eventually I’ll need that capability.
This should work:
If you aren’t doing any logic on that subpattern, you don’t even need the parentheses surrounding it if all you want to do is capture the digits. The following pattern:
will capture three or four digits directly following a space character.
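The patterns this answer refers to are not preserved here, so the following is only a sketch under stated assumptions, written in Python. `SPACE_THEN_DIGITS` is a guess at a pattern that fits the description "three or four digits directly following a space character". `EITHER` is one possible way to keep the "or" while still ending up with a single capture group, using a lookbehind for the space. Both names, and the helper `first_capture`, are hypothetical.

```python
import re

# Assumed pattern matching the description above: one whitespace character
# followed by three or four digits, with only the digits captured.
SPACE_THEN_DIGITS = re.compile(r"\s(\d{3,4})")

# Alternation inside a single capturing group: either three digits that are
# preceded by whitespace (checked with a lookbehind, so the space is not
# consumed), or any four digits.
EITHER = re.compile(r"((?<=\s)\d{3}|\d{4})")


def first_capture(pattern, text):
    """Return group 1 of the first match of pattern in text, or None."""
    match = pattern.search(text)
    return match.group(1) if match else None


input1 = "giantexpression 123more characters after"
input2 = "giantexpression1234blahblahblah"

print(first_capture(SPACE_THEN_DIGITS, input1))  # 123
print(first_capture(SPACE_THEN_DIGITS, input2))  # None (no space before the digits)
print(first_capture(EITHER, input1))             # 123
print(first_capture(EITHER, input2))             # 1234
```

One caveat with the lookbehind variant: because it does not consume the space, whatever precedes it in a larger pattern has to match that space instead. Another option that keeps the original two groups is to read whichever one matched, for example `match.group(1) or match.group(2)` in Python.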