I want to split a sentence into words and use them as tags (to build a simple full-text search in MongoDB), and I don't want to keep punctuation such as commas or colons:
phrase = "hello, this is a simple description!"
pattern = r"[\"'!?:,;]"
I've tried this:
re.split(pattern, phrase)
Out[1]: ['hello', ' this is a simple description', ''] # as you can see, I still get leading spaces and an empty string.
I want to remove all non-letter characters. There is phrase.replace(",", " "), but it replaces only one character at a time, so how do I use a regular expression with replace? Something like re.remove(pattern, phrase). And if a loop is involved, does this become heavy work for the server?
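For reference, the standard library function that does "replace with a regular expression" is re.sub (there is no re.remove). A minimal sketch, assuming the goal is a plain list of words and that the punctuation set from the question is the one to drop:

```python
import re

phrase = "hello, this is a simple description!"

# Same character class as in the question, written as a raw string.
pattern = r"[\"'!?:,;]"

# Replace each unwanted character with a space, then split on whitespace.
# str.split() with no arguments discards empty strings and extra spaces.
tags = re.sub(pattern, " ", phrase).split()
print(tags)  # ['hello', 'this', 'is', 'a', 'simple', 'description']

# Alternative: pick out the words directly instead of removing punctuation.
# Note that \w also matches digits and underscores, not only letters.
words = re.findall(r"\w+", phrase)
print(words)  # ['hello', 'this', 'is', 'a', 'simple', 'description']
```

The looping happens inside the re module rather than in your own Python code, so for short descriptions like this the cost should be small.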
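Non-regex solution: use strip(), but you need to pass all the non-letter characters to it. Note that strip() only removes characters from the start and end of a string, so apply it to each word after splitting. Something like:
word.strip(',!*&^%#$;:+')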
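A minimal sketch of the strip() approach applied to the phrase from the question, assuming words are separated by whitespace. The variable name unwanted is just illustrative:

```python
phrase = "hello, this is a simple description!"

# Characters to remove from the start and end of each word.
unwanted = ',!*&^%#$;:+'

# Split on whitespace first, then strip punctuation from both ends of each word.
tags = [word.strip(unwanted) for word in phrase.split()]

# Drop anything that was punctuation only and is now an empty string.
tags = [tag for tag in tags if tag]
print(tags)  # ['hello', 'this', 'is', 'a', 'simple', 'description']
```

Because strip() works only on the ends, punctuation inside a word (for example an apostrophe in "don't") is left in place.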