I am using python’s re.findall method to find occurrence of certain string value in

Question


Asked: June 18, 2026 (2026-06-18T14:17:37+00:00)

I am using Python's re.findall method to find occurrences of certain substrings in an input string.
For example, when searching the string 'ABCdef', I have two requirements:

1. Find substrings that start with a single capital letter followed by lowercase letters.
2. After step 1, find substrings that consist entirely of capital letters.

For example, the input strings and expected outputs are:

'USA' -- output: ['USA']
'BObama' -- output: ['B', 'Obama']
'Institute20CSE' -- output: ['Institute', '20', 'CSE']

So my expectation is that

>>> matched_value_list = re.findall('[A-Z][a-z]+|[A-Z]+', 'ABCdef')

returns ['AB', 'Cdef'].

But that does not seem to happen. What I get is ['ABC'], because the second part of the regex matches the whole run of capital letters.

Is there any way to take earlier matches into account, so that once 'Cdef' is matched by '[A-Z][a-z]+', the second part of the regex (i.e. '[A-Z]+') matches only the remaining string 'AB'?


1 Answer

Editorial Team · Answer 1 · 2026-06-18T14:17:39+00:00

First you need to match AB, which is a run of uppercase letters that is either followed by an uppercase letter and then a lowercase letter, or sits at the end of the string. For that you can use a look-ahead, which checks what comes next without consuming it.

Then you need to match an uppercase letter (C) followed by one or more lowercase letters (def).

So you can use this pattern:

>>> import re
>>> s = "ABCdef"
>>> re.findall("([A-Z]+(?=[A-Z][a-z]|$)|[A-Z][a-z]+)", s)
['AB', 'Cdef']

>>> re.findall("([A-Z]+(?=[A-Z][a-z]|$)|[A-Z][a-z]+)", 'MumABXYZCdefXYZAbc')
['Mum', 'ABXYZ', 'Cdef', 'XYZ', 'Abc']
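To make the effect of the look-ahead more visible, here is a minimal sketch that compares the pattern from the question with the look-ahead pattern above and reports where each match starts and ends. The helper name show_spans is made up for illustration, and the capturing group is dropped because it wraps the whole pattern and does not change what is matched.

```python
import re

# Pattern from the question: at each position the alternatives are tried left to right.
ORIGINAL = r"[A-Z][a-z]+|[A-Z]+"
# Pattern from this answer: the look-ahead inspects the following text without consuming it.
WITH_LOOKAHEAD = r"[A-Z]+(?=[A-Z][a-z]|$)|[A-Z][a-z]+"


def show_spans(pattern, text):
    """Return (start, end, matched_text) for every match (hypothetical helper)."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]


original_spans = show_spans(ORIGINAL, "ABCdef")
lookahead_spans = show_spans(WITH_LOOKAHEAD, "ABCdef")

print(original_spans)   # [(0, 3, 'ABC')]
print(lookahead_spans)  # [(0, 2, 'AB'), (2, 6, 'Cdef')]
```

With the original pattern, '[A-Z][a-z]+' fails at position 0 (A is followed by B, not a lowercase letter), so '[A-Z]+' takes ABC and the scan resumes at d, where nothing matches. With the look-ahead, the uppercase run has to give C back, which leaves it free to start the next match.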

As pointed out in a comment by @sotapme, you can also modify the above regex to:

r"([A-Z]+(?=[A-Z]|$)|[A-Z][a-z]+|\d+)"

I added \d+ since you also want to match digits, as in one of your examples. He also removed the [a-z] part from the first alternative of the look-ahead. That works because the + quantifier on the outer [A-Z] is greedy by default: it first matches as many uppercase letters as it can, then backtracks only as far as needed for the look-ahead to succeed. As a result it stops just before the last uppercase letter of the run, or takes the whole run when it sits at the end of the string.
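As a quick check, the modified pattern can be run against the examples from the question. This is only a sketch: the dictionary of expected outputs is copied from the question, the names PATTERN, examples and results are chosen here for illustration, and the outer capturing group is omitted since it does not affect what findall returns.

```python
import re

# Modified pattern from the comment, written as a raw string so \d is passed through as-is.
PATTERN = re.compile(r"[A-Z]+(?=[A-Z]|$)|[A-Z][a-z]+|\d+")

# Inputs and expected outputs taken from the question.
examples = {
    "USA": ["USA"],
    "BObama": ["B", "Obama"],
    "Institute20CSE": ["Institute", "20", "CSE"],
    "ABCdef": ["AB", "Cdef"],
}

results = {text: PATTERN.findall(text) for text in examples}

for text, expected in examples.items():
    status = "ok" if results[text] == expected else "MISMATCH"
    print(f"{text!r}: {results[text]} ({status})")
```

One case the question does not cover is an uppercase run followed directly by digits. For 'CSE20' this pattern returns ['CS', '20'], because the look-ahead only accepts an uppercase letter or the end of the string, so the final E is dropped. If such inputs matter, the look-ahead would need to allow digits as well.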
