Can Python expand what is matched by \w+ and \W+? How would I add more characters to its list?
Why? I'm going through some text and finding characters I would like to add to the word definition, such as & and æ.
If I cannot add to the word definition, then how do I change my function calls:
re.findall(r'\w+', txt)
re.findall(r'\W+', txt)
Well, \w is a predefined set of characters; you can't programmatically modify the meaning of \w. But you can set up a character class that matches any character in \w plus any other characters you want, using the [] syntax. So you'd change your regexes to
re.findall(r'[\w&æ]+', txt)
or
re.findall(r'[\W&æ]+', txt)
respectively.
This matches any character in the \w or \W set and adds & and æ. You can play around with these expressions on regexpal.
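To make the character-class idea concrete, here is a minimal sketch in Python 3. The sample string and the names WORD and NON_WORD are made up for illustration and are not from the question. One detail worth knowing: & is already a member of \W, so if the goal is to split on everything that is not a word character under the extended definition, a negated class such as [^\w&æ] is one way to express that. Also, in Python 3 \w on str patterns already matches Unicode letters such as æ unless re.ASCII is set, so listing æ explicitly is harmless but only matters in ASCII mode or on Python 2.

```python
import re

# Hypothetical sample text, chosen only to exercise & and æ.
txt = "salt & pepper, R&D, æther"

# Extended "word" definition: anything in \w, plus & and æ.
WORD = re.compile(r"[\w&æ]+")

# Complement of the extended definition: runs of characters that are
# neither in \w nor one of the added characters.
NON_WORD = re.compile(r"[^\w&æ]+")

plain_words = re.findall(r"\w+", txt)   # default \w: & splits "R&D"
words = WORD.findall(txt)               # & now counts as a word character
separators = NON_WORD.findall(txt)      # what is left between the words

print(plain_words)  # ['salt', 'pepper', 'R', 'D', 'æther']
print(words)        # ['salt', '&', 'pepper', 'R&D', 'æther']
print(separators)   # [' ', ' ', ', ', ', ']
```

Compiling the patterns once keeps the extended definition in a single place, so adding another character later means editing one class rather than every findall call.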