I want to create a regular expression for the following Python snippet.
import re
pattern = "\d*\.?\d+[Ee]?[+-]?\d*"
r = re.compile(pattern)
txt = """
12
.12
12.5
12.5E4
12.5e4
12.4E+4
12E4
12e-4
"""
x = r.findall(txt)
print(x)
This code is fine for extracting all the valid inputs from txt, but invalid inputs such as
.12e, 12.3+4
are also accepted. How can I fix this?
Try changing your regex to the following:
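A minimal sketch of such a pattern, assuming the exponent is wrapped in one optional group so that the sign and the exponent digits can only appear after an `e` or `E`. The exact pattern intended may differ in detail, and `NUMBER`, `is_valid_number`, and the sample lists are illustrative names only:

```python
import re

# Assumed pattern: optional integer part, optional dot, at least one digit,
# then an optional exponent group. Inside the group the sign is optional,
# but at least one exponent digit is required.
NUMBER = re.compile(r"\d*\.?\d+(?:[Ee][+-]?\d+)?")


def is_valid_number(text):
    """Return True only if the whole string matches (hypothetical helper)."""
    return NUMBER.fullmatch(text) is not None


valid = ["12", ".12", "12.5", "12.5E4", "12.5e4", "12.4E+4", "12E4", "12e-4"]
invalid = [".12e", "12.3+4"]

print([s for s in valid if is_valid_number(s)])    # every valid sample passes
print([s for s in invalid if is_valid_number(s)])  # prints []
```

With `findall`, a pattern like this still picks out the valid part of an invalid token (`.12` from `.12e`, for example), so checking each token with `fullmatch` is one way to reject such input outright.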
This makes it so that if the `e` or `E` is there, it is always followed by at least one digit, and so that `+` and `-` are only valid if they follow the `e` or `E`.

Note that you should be using a raw string literal so that the backslashes reach the regex engine unchanged (it doesn't affect this string in particular, but if you tried to use something like `\b` in your regex you would see the difference):
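A small illustration of that difference, using a made-up sample string: in a normal string literal `"\b"` is the backspace character, while in a raw string it stays as a backslash followed by `b`, which the regex engine reads as a word boundary.

```python
import re

text = "cat concat"  # illustrative sample input

# Normal literal: "\b" becomes a single backspace character (0x08),
# so the pattern looks for backspace + "cat" + backspace.
plain = re.findall("\bcat\b", text)

# Raw literal: the backslash is kept, so the regex engine sees \b
# and treats it as a word boundary.
raw = re.findall(r"\bcat\b", text)

print(len("\b"), len(r"\b"))  # 1 2
print(plain)                  # []
print(raw)                    # ['cat']
```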