I am trying to capture / extract numeric values from some strings.
Here is a sample string:
s='The shipping company had 93,999,888.5685 gallons of fuel on hand'
I want to pull the 93,999,888.5685 value.
I have gotten my regex to this:
mine=re.compile("(\d{1,3}([,\d{3}])*[.\d+]*)")
However, when I do a findall I get the following:
mine.findall(s)
[('93,999,888.5685', '8')]
I have tried a number of different strategies to keep it from matching on the 8,
but I am now realizing that I am not sure why it matched on the 8 in the first place.
Any illumination would be appreciated.
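The 8 is captured because your pattern has 2 capturing groups, so findall returns a tuple with one entry per group.
Your second group, ([,\d{3}]), is responsible for the additional match. Note that [,\d{3}] is a character class, so the group matches a single character at a time, and because the group is repeated with *, it keeps only the last character it matched: the final 8 of 888.
Mark the 2nd group as a non-capturing group using ?: with this pattern:
(\d{1,3}(?:[,\d{3}])*[.\d+]*)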
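
To make the difference concrete, here is a small sketch that runs the original pattern and the non-capturing version against the sample string from the question. The third pattern, `strict`, is not from the question or the answer above; it is my own assumption about what the character classes were probably meant to express (comma followed by exactly three digits, then an optional decimal part), so treat it as one possible tightening rather than the definitive pattern.

```python
import re

s = 'The shipping company had 93,999,888.5685 gallons of fuel on hand'

# Pattern from the question: two capturing groups, so findall returns
# one tuple per match, with one item per group.
original = re.compile(r"(\d{1,3}([,\d{3}])*[.\d+]*)")
original_result = original.findall(s)

# Same pattern with the inner group marked non-capturing via ?:
# Only one capturing group remains, so findall returns plain strings.
fixed = re.compile(r"(\d{1,3}(?:[,\d{3}])*[.\d+]*)")
fixed_result = fixed.findall(s)

# Hypothetical stricter variant (an assumption, not part of the answer):
# real groups instead of character classes, so a comma must be followed
# by exactly three digits and the decimal part is a dot plus digits.
strict = re.compile(r"\d{1,3}(?:,\d{3})*(?:\.\d+)?")
strict_result = strict.findall(s)

print(original_result)
print(fixed_result)
print(strict_result)
```

With the sample string, the first call reproduces the tuple from the question and the other two return just the number. One caveat on the stricter variant: it assumes the numbers always use comma thousands separators, so an unseparated number such as 1234 would be split into several matches.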