I have a regular expression to find patterns of the form :ABC:`hello`. This is the code:
import re

format = r".*\:(.*)\:\`(.*)\`"
patt = re.compile(format, re.I|re.U)
m = patt.match(l.rstrip())  # l is the current input line
if m:
    ...
It works well when the pattern occurs once in a line, but for a line such as ":tagbox:`Verilog` :tagbox:`Multiply` :tagbox:`VHDL`" it finds only the last one.
How can I find all three occurrences?
EDIT
Based on Paul Z’s answer, I got it working with this code:
format = r"\:([^:]*)\:\`([^`]*)\`"
patt = re.compile(format, re.I|re.U)
# finditer yields one match object per non-overlapping match in the line
for m in patt.finditer(l.rstrip()):
    tag, value = m.groups()
    print(tag, ":::", value)
Result
tagbox ::: Verilog
tagbox ::: Multiply
tagbox ::: VHDL
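If only the captured groups are needed rather than the match objects, re.findall returns the same pairs as a list. This is a minimal sketch using the same pattern and the sample line from the question; the names pattern, line and pairs are just for illustration.

```python
import re

# Same pattern as in the edit above; the sample line is the one from the question.
pattern = re.compile(r"\:([^:]*)\:\`([^`]*)\`", re.I | re.U)
line = ":tagbox:`Verilog` :tagbox:`Multiply` :tagbox:`VHDL`"

# With two capture groups, findall returns a list of (tag, value) tuples.
pairs = pattern.findall(line)
for tag, value in pairs:
    print(tag, ":::", value)
```

finditer is still the better fit when match positions (m.start(), m.end()) are needed as well.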
Yeah, dcrosta suggested looking at the re module docs, which is probably a good idea, but I’m betting you actually wanted the finditer function.

Your current solution always finds the last one because the initial .* eats as much as it can while still leaving a valid match (the last one). Incidentally, this is also probably making your program much slower than it needs to be, because .* first tries to eat the entire string, then backs up character by character as the remaining expression tells it “that was too much, go back”. Using finditer with a pattern that does not start with .* should be much more performant.
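To make the difference concrete, here is a small sketch that runs both patterns on the sample line from the question. The variable names (greedy, tight, found) are illustrative, and the second pattern is the one from the question's edit, not necessarily the exact code originally posted in this answer.

```python
import re

line = ":tagbox:`Verilog` :tagbox:`Multiply` :tagbox:`VHDL`"

# Original pattern: the leading .* is greedy, so it swallows the first two tags
# and only the last tag is left for the capture groups.
greedy = re.compile(r".*\:(.*)\:\`(.*)\`", re.I | re.U)
greedy_groups = greedy.match(line).groups()
print(greedy_groups)  # ('tagbox', 'VHDL')

# No leading .*, and the negated character classes keep each group inside
# a single tag, so finditer can report every occurrence in turn.
tight = re.compile(r"\:([^:]*)\:\`([^`]*)\`", re.I | re.U)
found = [m.groups() for m in tight.finditer(line)]
print(found)
```

The second print shows three (tag, value) pairs, one per :tagbox: entry, in the order they appear in the line.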