I have a string with a mix of uppercase and lowercase letters. I need want to find every lowercase letter that is surronded by 3 uppercase letters and extract it from the string.
For instance ZZZaZZZ I want to extract the a in the previous string.
I have written a script that is able to extract ZZZaZZZ but not the a alone. I know I need to use nested regex expressions to do this but I can not wrap my mind on how to implement this. The following is what I have:
import string, re
if __name__ == "__main__":
#open the file
eqfile = open("string.txt")
gibberish = eqfile.read()
eqfile.close()
r = re.compile("[A-Z]{3}[a-z][A-Z]{3}")
print r.findall(gibberish)
EDIT:
Thanks for the answers guys! I guess I should have been more specific. I need to find the lowercase letter that is surrounded by three uppercase letters that are exactly the same, such as in my example ZZZaZZZ.
You are so close! Read about the
.group* methods of MatchObjects. For example, if your script ended withthen you’d capture the desired character inside the first group.
To address the new constraint of matching repeated letters, you can use backreferences:
That reads like:
middle.middlegroup.