I am trying to delete all numbers from a string as long as the number ends in " ", "grams", "g", "kg" or "kilograms".
I am using a regular expression but it's not removing any numbers. What's going wrong?
For example, the string "abc 1231g kjsjk jkdsfkjdkj 11kg" should produce "abc kjsjk jkdsfkjdkj ".
Python code:
from re import sub
test = "abc 1231g kjsjk jkdsfkjdkj 11kg"
test = sub("[\d]+[\sg|$grams|$kg|$kilograms]$"," ",test)
print test # every number is still there
Your regular expression is not capturing what you're looking for. The square brackets [] define a character class, which matches a single character from the set, so [\sg|$...] isn't what you want. The trailing $ also anchors the match to the end of the string. Instead, start with \d+ for the number, then use parentheses () for grouping and put all the possible suffixes inside, separated by |.
To get the output you specified, we need to change a few more things. The replacement string should be "" instead of " ", and we need to pick up an extra space at the end by appending \s? to the regex.