This sounds very simple, I know, but for some reason I can't get all the results I need.
A word in this case is any run of non-whitespace characters that is separated from other words by whitespace.
For example, in the following string: "Hello there stackoverflow."
the result should be: ['Hello', 'there', 'stackoverflow.']
My code:
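import re
text = "Hello there stackoverflow."  # sample string from above
word_pattern = r"^\S*\s|\s\S*\s|\s\S*$"  # raw string so the backslashes reach the regex engine
result = re.findall(word_pattern, text)
print(result)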
But after using this pattern on a string like the one I've shown, it only puts the first and the last words in the list, and not the words that have whitespace on both sides.
What is the problem with this pattern?
Use the \b word-boundary test instead, or skip the regular expression altogether and just use .split(). The latter would include the punctuation in a sentence (a \b-based regex does not match the trailing . in the sentence).
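A minimal sketch of the options, using the sample string from the question. The pattern r"\b\S+\b" is one plausible form of the \b boundary test, not necessarily the exact pattern the answer had in mind, and the variable names are only illustrative. It also shows why the original pattern skips the middle word: re.findall returns non-overlapping matches, and each match consumes the whitespace that the next alternative would need at its start.

```python
import re

text = "Hello there stackoverflow."

# Pattern from the question: the first match "Hello " consumes the space
# after it, so "there" no longer has leading whitespace left to match.
original_matches = re.findall(r"^\S*\s|\s\S*\s|\s\S*$", text)
print(original_matches)   # ['Hello ', ' stackoverflow.']

# Assumed \b-based pattern: runs of non-whitespace that begin and end at
# a word boundary, which leaves out the trailing punctuation.
boundary_words = re.findall(r"\b\S+\b", text)
print(boundary_words)     # ['Hello', 'there', 'stackoverflow']

# No regex at all: split on any whitespace, punctuation stays attached.
split_words = text.split()
print(split_words)        # ['Hello', 'there', 'stackoverflow.']

# Regex equivalent of split() for this input: any run of non-whitespace.
nonspace_words = re.findall(r"\S+", text)
print(nonspace_words)     # ['Hello', 'there', 'stackoverflow.']
```

Since the question wants 'stackoverflow.' with the period kept, .split() or r"\S+" matches the stated expected result; the \b version is the one to pick when punctuation should be dropped.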