I like the solution provided by “Remove not alphanumeric characters from string. Having trouble with the [\] character”, but how would I do this while leaving the spaces in place?
I need to tokenize the string on spaces after it has been cleaned.
Shamelessly stolen from the other answer.
^ in the character class means “not.” So this is “not” \w (equivalent to \W) and not \s, which matches whitespace characters (spaces, tabs, etc.). You can just use a literal space instead of \s if you only need to keep spaces.
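For illustration, here is a minimal sketch of the idea in Python. The thread does not say which language is in use, so both the language and the exact pattern are assumptions: the class [^\w\s] is reconstructed from the description above, and clean_and_tokenize is a hypothetical helper name. re.ASCII is set so that \w means only A-Z, a-z, 0-9 and underscore; without it, Python's \w also matches non-ASCII letters and digits.

```python
import re

# Assumed pattern, reconstructed from the explanation above: match any
# character that is neither a word character (\w) nor whitespace (\s).
# re.ASCII restricts \w to [A-Za-z0-9_] and \s to ASCII whitespace.
NOT_WORD_OR_SPACE = re.compile(r"[^\w\s]", re.ASCII)


def clean_and_tokenize(text):
    """Strip everything except word characters and whitespace, then split."""
    cleaned = NOT_WORD_OR_SPACE.sub("", text)
    # split() with no arguments splits on runs of whitespace, so the gaps
    # left behind by removed punctuation do not produce empty tokens.
    return cleaned.split()


print(clean_and_tokenize("Hello, world! It's 9:30 [\\] now."))
# ['Hello', 'world', 'Its', '930', 'now']
```

Note that \w also keeps underscores, so "a_b" survives intact. If only literal spaces should be kept (not tabs or newlines), the class would become [^\w ] instead.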