I’m using re.split() to separate a string into tokens. The pattern I currently pass as the argument is [^\dA-Za-z], which splits on every non-alphanumeric character and so leaves only alphanumeric tokens.
However, I also need to split tokens that mix digits and letters into tokens that contain only one or the other, e.g.
re.split(pattern, "my t0kens")
would return ["my", "t", "0", "kens"].
I’m guessing I might need to use lookahead/lookbehind, but I’m not sure if that’s actually necessary or if there’s a better way to do it.
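Try re.findall() instead of re.split(): rather than describing the separators, describe the tokens themselves and collect every match.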
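A minimal sketch of that idea, assuming a token is either a run of digits or a run of ASCII letters. The pattern and the helper name split_tokens are my own illustration, not necessarily what the original answer used:

```python
import re

def split_tokens(text):
    # Hypothetical helper: match runs of digits OR runs of ASCII letters.
    # Anything else (spaces, punctuation) matches neither alternative and is skipped.
    return re.findall(r"\d+|[A-Za-z]+", text)

print(split_tokens("my t0kens"))  # ['my', 't', '0', 'kens']
```

Because each alternative matches only one character class, a mixed token such as "t0kens" is broken at every letter/digit boundary without needing lookahead or lookbehind.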
Edit: Better way from Bart’s comment below.