I have this string:
circle,4.5
square,3.1
circle,2.0
triangle,4.7,4.9
square,4.1
circle,4.3
Let's say I want to capture the name of the shape and the numbers next to it. I've tried this, and I've commented on the issue I have inside it:
>>> ma = re.search(r"(\w+)[,(\d+.\d+)]+", "Triangle,3.4,1.2")
>>> ma.group()
'Triangle,3.4,1.2'
>>> ma.group(1)
'Triangle'
>>> ma.group(2)  # Why is this happening?
Traceback (most recent call last):
  File "<pyshell#29>", line 1, in <module>
    ma.group(2)
IndexError: no such group
I guess I can't put capturing groups inside square brackets?
Square brackets are special; they mark all characters inside of them as a character class. You are asking to match either a digit (\d), a comma, a full stop, a plus sign, an opening parenthesis or a closing parenthesis. In other words, the opening and closing parentheses are part of the matched characters; they do not denote a capturing group. That is why the pattern has only one group, and ma.group(2) raises an IndexError.

You don't need to use a character class at all here. You are looking for a more specific pattern: a number, followed by a full stop, followed by another number (escape the full stop as \. so that it matches only a literal dot). Use a non-capturing group, (?:...), to group the number format together with the comma so that the pair can repeat, for example with the pattern (\w+)(?:,(\d+\.\d+))+.

Unfortunately, this still won't capture more than one number for you; regular expressions will never produce a variable number of groups. When a capturing group repeats, only its last match is kept. We've defined only two groups here, so that's all we get:
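A minimal sketch of that behaviour, assuming the escaped-dot pattern (\w+)(?:,(\d+\.\d+))+ and the sample input from the question; the variable names are only illustrative:

```python
import re

# One capturing group for the name, and one capturing group for a number,
# wrapped in a repeating non-capturing group that also consumes the comma.
pattern = re.compile(r"(\w+)(?:,(\d+\.\d+))+")

ma = pattern.search("Triangle,3.4,1.2")
whole = ma.group()    # the full match, including both numbers
groups = ma.groups()  # only two groups exist; group 2 holds the last repetition

print(whole)   # Triangle,3.4,1.2
print(groups)  # ('Triangle', '1.2')
```

The first number, 3.4, is part of the overall match but is overwritten in group 2 by the later repetition.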
See "Regex question about parsing method signature" and "python regex repetition with capture question" for other SO questions that ran into this limitation.
Your format is actually very simple, and you'd be much better off not using regular expressions at all. Simply split on the comma and be done with it:
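One way to do that is sketched below, assuming the input is the multi-line string from the question and that every value after the shape name is a float; the names data and shapes are made up for the example:

```python
data = """\
circle,4.5
square,3.1
circle,2.0
triangle,4.7,4.9
square,4.1
circle,4.3"""

shapes = []
for line in data.splitlines():
    # The first field is the shape name; everything after it is a number.
    name, *numbers = line.split(",")
    shapes.append((name, [float(n) for n in numbers]))

for name, numbers in shapes:
    print(name, numbers)
```

Because the starred target collects however many fields remain, this handles one number per line as well as two, which is exactly what a fixed set of regex groups cannot do.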