The actual string s is longer, but I have shortened it to simplify the example.
>>> import re
>>> s = "Blah. Tel.: 555 44 33 22."
>>> m = re.search(r"\s*Tel\.:\s*(?P<telephone>.+?)\.", s)
>>> m.group("telephone")
'555 44 33 22'
The code above works, but if I wrap the regex in ()? to make it optional, I don’t get any telephone.
>>> m = re.search(r"(\s*Tel\.:\s*(?P<telephone>.+?)\.)?", s)
>>> m
<_sre.SRE_Match object at 0x9369890>
>>> m.group("telephone")
What’s the problem here? Thanks!
Edit:
This is part of a larger regular expression in which I’m getting many values from every line of a big file.
regex = r"^(?P<title>.[^(]+);" \
r"\s*(?P<subtitle>.+)\." \
r"\s*Tel\.:\s*(?P<telephone>.+?)(\.|;)" \
r"\s*(?P<url>(www\.|http://).+?\.[a-zA-Z]+)(\.|;)" \
r"(\s*(?P<text>.+?)\.)?" \
r"\s*coor:(\s*(?P<lat>.+?),\s*(?P<long>.+?))?$"
One sample line could be:
l = "Title title; Subtitle, subtitle. Tel.: 555 33 44 11. www.url.com. coor: 11.11111, -2.222222"
And another sample line:
l = "Title2 title; Subtitle2, subtitle. Tel.: 555 33 44 11. www.url2.com. coor: 44.444444, -6.66666"
It’s a really big regex, so that’s why I didn’t post it.
Your regex is too unspecific in what the title and subtitle bits are matching. They are gobbling up the telephone part, and if that part is made optional, matching simply continues at the next part of the regex (and succeeds). Only if it’s not optional does the regex engine have to backtrack so it can find an overall match.
Try
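As a minimal sketch of the effect described above (an illustration built on the question's shortened string, not necessarily the pattern this answer goes on to suggest): an optional group may match nothing at all, and a greedy group in front of it can swallow the telephone text. The narrower subtitle pattern `[^.]+` is an assumption that a subtitle never contains a period; the variable names are made up for the example.

```python
import re

s = "Blah. Tel.: 555 44 33 22."

# 1. The whole pattern is optional, so an empty match at position 0 is
#    a valid result and the engine never needs to look further.
optional = re.search(r"(\s*Tel\.:\s*(?P<telephone>.+?)\.)?", s)
print(optional.span())              # (0, 0)
print(optional.group("telephone"))  # None

# 2. A greedy ".+" before the optional group consumes the telephone text;
#    the optional group is then skipped and the match still succeeds.
greedy = re.search(
    r"^(?P<subtitle>.+)\.(\s*Tel\.:\s*(?P<telephone>.+?)\.)?$", s
)
print(greedy.group("subtitle"))     # Blah. Tel.: 555 44 33 22
print(greedy.group("telephone"))    # None

# 3. Assumption: a subtitle contains no period. "[^.]+" then stops at the
#    first period, leaving the telephone part for the optional group.
specific = re.search(
    r"^(?P<subtitle>[^.]+)\.(\s*Tel\.:\s*(?P<telephone>.+?)\.)?$", s
)
print(specific.group("subtitle"))   # Blah
print(specific.group("telephone"))  # 555 44 33 22
```

Whether the same restriction fits the full regex depends on the real data: if titles or subtitles can legitimately contain periods, a different delimiter or a more explicit pattern would be needed.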