I knew that [] denotes a set of allowable characters –
>>> import re
>>> p = r'^[ab]$'
>>>
>>> re.search(p, '')
>>> re.search(p, 'a')
<_sre.SRE_Match object at 0x1004823d8>
>>> re.search(p, 'b')
<_sre.SRE_Match object at 0x100482370>
>>> re.search(p, 'ab')
>>> re.search(p, 'ba')
But … today I came across an expression with vertical bars within parentheses to define mutually exclusive patterns –
>>> q = r'^(a|b)$'
>>>
>>> re.search(q, '')
>>> re.search(q, 'a')
<_sre.SRE_Match object at 0x100498dc8>
>>> re.search(q, 'b')
<_sre.SRE_Match object at 0x100498e40>
>>> re.search(q, 'ab')
>>> re.search(q, 'ba')
This seems to mimic the same functionality as above, or am I missing something?
PS: In Python, parentheses themselves are used to define logical groups of matched text. If I use the second technique, how do I use parentheses for both jobs?
In this case it is the same.
However, alternation is not limited to a single character. For instance,
^(hello|world)$
will match “hello” or “world” (and only these two inputs), while
^[helowrd]$
would just match a single character (“h” or “w” or “d” or whatnot).
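To make the difference concrete, here is a minimal sketch. The two patterns and the sample strings are my own illustrative choices, picked to fit the description above, and the 'b-42' input is hypothetical. The last part touches on the PS: a parenthesized alternation is itself a capturing group, and Python's re module also offers the non-capturing form (?:...) when you only want the alternation.

```python
import re

# Illustrative patterns (assumed for this sketch).
alternation = r'^(hello|world)$'   # one of two whole words
char_class = r'^[helowrd]$'        # exactly one character from the set

for text in ('hello', 'world', 'h', 'hw'):
    print(text,
          bool(re.search(alternation, text)),
          bool(re.search(char_class, text)))

# Parentheses do both jobs at once: (a|b) alternates AND captures.
m_capturing = re.search(r'^(a|b)-(\d+)$', 'b-42')
print(m_capturing.groups())

# (?:...) groups for alternation only, without creating a capture.
m_noncapturing = re.search(r'^(?:a|b)-(\d+)$', 'b-42')
print(m_noncapturing.groups())
```

With the capturing form, the alternation's text is available as group 1 and the digits as group 2; with the non-capturing form, the digits become group 1.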
Happy coding.