Consider a string s = aa,bb11,22 , 33 , 44,cc , dd . I

Asked: June 8, 2026

Consider a string s = "aa,bb11,22 , 33 , 44,cc , dd ".

I would like to split s into the following list of tokens using the regular expressions module in Python, which is similar to the functionality offered by Perl:

"aa,bb11"
"22"
"33"
"44,cc , dd "

Note:

I want to tokenise on commas, but only on commas that have a digit on either side (ignoring any whitespace around the comma).
Any (optional) whitespace around these “numerical commas” that I’m targeting should be removed in the result. The optional whitespace may be more than a single space.
Any other whitespace should be left as it appears in the original string.

My best attempt so far is the following:

import re

pattern = r'(?<=\d)(\s*),(\s*)(?=\d)'
s = 'aa,bb11,22 , 33 , 44,cc , dd '

print(re.compile(pattern).split(s))

but this prints:

['aa,bb11', '', '', '22', ' ', ' ', '33', ' ', ' ', '44,cc , dd ']

which is close to what I want, inasmuch as the 4 things I want are contained in the list. I could go through and get rid of any empty strings and any strings that consist of only spaces/commas, but I’d rather have a single line regex that does all this for me.

Any ideas?

1 Answer

Editorial Team · Answer 1 · 2026-06-08T04:07:46+00:00

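Don’t put capture groups around the \s*. When the pattern contains capturing groups, re.split also returns the text matched by each group, which is where the extra empty and whitespace-only strings come from: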

pattern = r'(?<=\d)\s*,\s*(?=\d)'
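To see the difference side by side, here is a minimal sketch that runs both patterns on the string from the question. The variable names with_groups and without_groups are only illustrative and do not come from the original post.

```python
import re

s = 'aa,bb11,22 , 33 , 44,cc , dd '

# With capturing groups, re.split inserts the text matched by each
# group (the whitespace on either side of the comma) into the result.
with_groups = re.split(r'(?<=\d)(\s*),(\s*)(?=\d)', s)

# Without groups, only the pieces between the separators are returned,
# and the whitespace around each "numerical comma" is consumed.
without_groups = re.split(r'(?<=\d)\s*,\s*(?=\d)', s)

print(with_groups)
# ['aa,bb11', '', '', '22', ' ', ' ', '33', ' ', ' ', '44,cc , dd ']
print(without_groups)
# ['aa,bb11', '22', '33', '44,cc , dd ']
```

If part of a separator pattern ever needs grouping (for alternation, say), a non-capturing group (?:...) keeps the grouping without adding anything to the split result.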
