I would like to remove text from my strings that start with "\", such as:
\xf, \africa\87, \ckat\x70, ...
Is there a way of doing this using greedy characters in re.sub?
e.g.:
line = re.sub("[\.*]", "", line)
Thanks!
EDIT:
input example:
" lorem ipsum \xe2\x80\x9csianhill7 lorem ipsum"
output:
" lorem ipsum lorem ipsum"
If I understand your question correctly, you want to remove every word that contains non-ASCII characters from your sentences.
You can easily do it in a single pass with a list comprehension that checks each character's ordinal and filters out the offending words, without employing regex.
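A minimal sketch of that idea, assuming Python 3 and that the `\x..` sequences in your input are real non-ASCII characters rather than literal backslashes. The helper name `strip_non_ascii_words` is made up for illustration:

```python
def strip_non_ascii_words(line):
    """Drop every space-separated word that contains a non-ASCII character."""
    # split(" ") instead of split() so leading and repeated spaces survive
    words = line.split(" ")
    # keep a word only if every character's ordinal is in the ASCII range
    kept = [w for w in words if all(ord(c) < 128 for c in w)]
    return " ".join(kept)


line = " lorem ipsum \xe2\x80\x9csianhill7 lorem ipsum"
print(repr(strip_non_ascii_words(line)))  # ' lorem ipsum lorem ipsum'
```

Note that this drops the whole word (`sianhill7` goes too), which matches the expected output in the question. If the backslashes are literal characters in your data, the ordinal test will not catch them and you would need a different check.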