I came across this problem as I was working on the Python Challenge .

Question

Asked: June 8, 2026

I came across this problem as I was working on the Python Challenge. Number 10 to be exact. I decided to try and solve it using regexes – pulling out the repeating sequences, counting their length, and building the next item in the sequence off of that.

So the regex I developed was: '(\d)\1*'

It worked well on the online regex tester, but when using it in my script it didn’t perform the same:

import re

regex = re.compile('(\d)\1*')  # non-raw string: this is the version that misbehaves
text = '111122223333'
re.findall(regex, text)

> ['1', '1', '1', '1', '2', '2', '2',...]

And so on and so forth. So I learned about raw string notation from the re module documentation, which leads to my first question: can someone please explain what exactly it does? The docs describe it as reducing the need to escape backslashes, but it doesn’t appear to be required for simpler regexes such as \d+, and I don’t understand why.

So I change my regex to r'(\d)\1*' and now try and use findall() to make a list of the sequences. And I get

> ['1', '2', '3']

Very confused again. I still don’t understand this. Help please?

I decided to do this to get around this:

[m.group() for m in regex.finditer(text)]
> ['1111', '2222', '3333']

And get what I’ve been looking for. Then, based off of this thread, I try doing findall() adding a group to the whole regex -> r'((\d)\2*)'.
I end up getting:

> [('1111', '1'), ('2222', '2'), ('3333', '3')]

At this point I’m all kinds of confused. I know that this result has something to do with multiple groups, but I’m just not sure.

Also, this is my first time posting so I apologize if my etiquette isn’t correct. Please feel free to correct me on that as well. Thanks!

1 Answer

Editorial Team · Answer 1 · 2026-06-08T08:25:33+00:00

Since this is a challenge, I won’t give you a complete answer. You are on the right track, however.

The finditer method returns match objects (MatchObject instances; re.Match in Python 3). You want to look at the .group() method on these and read the documentation carefully. Think about what the difference is between .group(0) and .group(1) there; plain .group() is the same as .group(0).
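As a small illustration of that difference, here is a sketch that reuses the sample text and the raw-string pattern from the question. The names `pattern`, `whole` and `first` are only illustrative and not part of any API.

```python
import re

# Sample text and pattern taken from the question.
text = '111122223333'
pattern = re.compile(r'(\d)\1*')

whole = []  # what group(0) returns for each match
first = []  # what group(1) returns for each match

for m in pattern.finditer(text):
    # group(0), or plain group(), is everything the pattern matched.
    whole.append(m.group(0))
    # group(1) is only what the first pair of parentheses captured.
    first.append(m.group(1))

print(whole)
print(first)
```

Comparing the two printed lists with the outputs in the question should show which of them findall() was handing back.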

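As for the \d escape sequence: because that particular combination has no meaning as a Python string escape, Python leaves it as a backslash followed by the letter d, which is why \d+ works without the r prefix. \1, on the other hand, is a valid octal escape, so in a non-raw string it becomes the single character with code point 1 before the re module ever sees it; that is why your first pattern matched one digit at a time. It is better to always use the r'' raw string literal, as it prevents nasty surprises when a regular expression escape also happens to be an escape sequence Python recognizes. Recent Python versions also warn about unrecognized escapes such as '\d' in non-raw strings. See the Python documentation on string literals for more information.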
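To make the effect of string escapes concrete, here is a minimal sketch. The variable names are illustrative, and the non-raw pattern is written with an explicit double backslash for \d so that it builds the same string as the question's '(\d)\1*' without triggering an invalid-escape warning.

```python
import re

# In a normal string literal, '\1' is an octal escape: one character, code point 1.
octal_escape = '\1'
# In a raw string literal, the backslash is kept: two characters, '\' and '1'.
raw_backref = r'\1'

print(len(octal_escape), ord(octal_escape))
print(len(raw_backref))

text = '111122223333'

# Same string the question's non-raw '(\d)\1*' produces: (\d) followed by chr(1) and '*'.
non_raw_pattern = '(\\d)\1*'
raw_pattern = r'(\d)\1*'

# chr(1) never occurs in the text, so each digit matches on its own.
non_raw_result = re.findall(non_raw_pattern, text)
# Here \1 reaches the regex engine as a backreference to group 1.
raw_result = re.findall(raw_pattern, text)

print(non_raw_result)
print(raw_result)
```

The regex engine only ever sees the string after Python has processed the literal, which is why the two patterns behave differently even though they look alike in the source.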

Your .findall() with the r'((\d)\2*)' expression returns 2 elements per match as you have 2 groups in your pattern; the outer, whole group matching (\d)\2* and the inner group matching \d. From the .findall() documentation:

If one or more groups are present in the pattern, return a list of groups; this will be a list of tuples if the pattern has more than one group.
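The rule quoted above can be checked with a short sketch that runs findall() on patterns with zero, one and two groups. The text is the one from the question; the group-free pattern \d{4} is just a stand-in chosen for this sample and is not a general solution.

```python
import re

text = '111122223333'

# No groups: findall() returns the full text of each match.
no_groups = re.findall(r'\d{4}', text)

# One group: findall() returns only what that group captured.
one_group = re.findall(r'(\d)\1*', text)

# Two groups: findall() returns one tuple per match, one item per group.
two_groups = re.findall(r'((\d)\2*)', text)

print(no_groups)
print(one_group)
print(two_groups)

# Taking the first item of each tuple gives the text captured by the outer group.
runs = [outer for outer, _inner in two_groups]
print(runs)
```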
