What is a word boundary in a Python regex? Can someone please explain this on these examples:
Example 1
>>> import re
>>> x = '456one two three123'
>>> y=re.search(r"\btwo\b",x)
>>> y
<_sre.SRE_Match object at 0x2aaaaab47d30>
Example 2
>>> y=re.search(r"two",x)
>>> y
<_sre.SRE_Match object at 0x2aaaaab47d30>
Example 3
>>> ip="192.168.254.1234"
>>> if re.search(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b",ip):
...     print(ip)
...
Example 4
>>> ip="192.168.254.1234"
>>> if re.search(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}",ip):
...     print(ip)
...
192.168.254.1234
A “word boundary” (\b) is exactly what it says: the boundary of a word, i.e. either its beginning or its end. A "word" here is a run of \w characters: letters, digits, and underscores.
\b is zero-width: it does not consume any character of the input. It only succeeds if the current match position is at the beginning or end of a word.
This matters because, unlike matching whitespace explicitly, \b also matches at the very beginning or end of the entire input (as long as a word character sits there).
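To make the comparison with whitespace concrete, here is a small sketch. The sample string and variable names are only for illustration. It shows that \b matches at the edge of the input, where a whitespace-based pattern fails, and that the matched text contains no extra characters.

```python
import re

text = "two three"  # illustrative input; "two" sits at the very start

# Requiring whitespace on both sides fails: nothing precedes "two".
ws_match = re.search(r"\stwo\s", text)

# \b succeeds because the start of the input counts as a boundary
# when the first character is a word character.
wb_match = re.search(r"\btwo\b", text)

# \b is zero-width: the matched text is just "two", with no space.
matched_text = wb_match.group(0)
matched_span = wb_match.span()

print(ws_match)      # None
print(matched_text)  # two
print(matched_span)  # (0, 3)
```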
So:
'\bfoo' will match 'foobar' and 'foo bar' and 'bar foo', but not 'barfoo'.
'foo\b' will match 'foo bar' and 'bar foo' and 'barfoo', but not 'foobar'.
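These claims can be checked directly, together with the examples from the question. This is only a sketch: the helper name `matches` is made up for illustration. Note the raw strings, since in an ordinary Python string literal '\b' is a backspace character rather than a word boundary.

```python
import re

def matches(pattern, text):
    """Return True if pattern matches anywhere in text (hypothetical helper)."""
    return re.search(pattern, text) is not None

# \bfoo needs a boundary right before "foo".
starts = {s: matches(r"\bfoo", s) for s in ["foobar", "foo bar", "bar foo", "barfoo"]}

# foo\b needs a boundary right after "foo".
ends = {s: matches(r"foo\b", s) for s in ["foo bar", "bar foo", "barfoo", "foobar"]}

# Example 1 from the question: "two" has a space on each side, so both \b succeed.
x = "456one two three123"
two_found = matches(r"\btwo\b", x)
# Digits are word characters, so there is no boundary between "456" and "one".
one_found = matches(r"\bone\b", x)

# Example 3 from the question: the last \d{1,3} can take at most "123";
# the next character is "4", another word character, so the final \b fails.
ip_pattern = r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b"
bad_ip = matches(ip_pattern, "192.168.254.1234")
good_ip = matches(ip_pattern, "192.168.254.123")

print(starts)
print(ends)
print(two_found, one_found)
print(bad_ip, good_ip)
```

Without the trailing \b (Example 4), the pattern is free to stop after "192.168.254.123" and ignore the final "4", which is why that version prints the address.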