I am splitting a string using "Python strings split with multiple separators":
import re
DATA = "Hey, you - what are you doing here!?"
print(re.findall(r'\w+', DATA))
# Prints ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']
I want to get a separate list of what's in between the matched words:
[", ", " - ", " ", " ", " ", " ", "!?"]
How do I do this?
re.findall(r'\W+', DATA) yields the list you are looking for:
[', ', ' - ', ' ', ' ', ' ', ' ', '!?']
I used \W+ rather than \w+, which negates the character class you were using: \w matches word characters (letters, digits, and underscore), and \W matches everything else.
This Regular Expression Reference Sheet might be helpful in selecting the best character classes and metacharacters for your regular expression searches and matches. Also, see this tutorial for more information (especially the reference section toward the bottom of the page).
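To make the difference concrete, here is a minimal sketch that runs both patterns on the string from the question. The variable names words, separators, and interleaved are only illustrative. The last step is an optional extra, not something the question asked for: if you need the words and separators together in their original order, re.split with a capturing group keeps the separators in the result.

```python
import re

DATA = "Hey, you - what are you doing here!?"

# \w+ matches runs of word characters; \W+ matches runs of everything else.
words = re.findall(r'\w+', DATA)
separators = re.findall(r'\W+', DATA)

print(words)       # ['Hey', 'you', 'what', 'are', 'you', 'doing', 'here']
print(separators)  # [', ', ' - ', ' ', ' ', ' ', ' ', '!?']

# Wrapping the pattern in a capturing group makes re.split keep the
# separators, so words and separators come back interleaved.
interleaved = re.split(r'(\W+)', DATA)
print(interleaved)
```

Because the string ends with a separator, the list returned by re.split has an empty string as its last element; drop it if you do not need it.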