I have a file containing Perl-style regexes of the form /pattern/replace/ that I’m trying to read into Python as a list of compiled patterns and their associated replacement strings. Below is what I’ve done so far.
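import re

def get_regex(filename):
    regex = []
    # Read every line that is not a comment.
    with open(filename, 'r') as fi:
        text = [l for l in fi.readlines() if not l.startswith("#")]
    for line in text:
        # Drop the leading slash, split on the remaining slashes, and
        # discard the trailing field after the closing slash.
        ptn, repl = line[1:].split('/')[:-1]
        regex.append((re.compile(ptn), repl))
    return regex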
This works perfectly well until I get to lines with escaped forward slashes, like this:
/$/ <\\/s>/
When I try to split this string, Python returns a list of three elements, ['$', ' <\\', 's>'], rather than the hoped-for ['$', ' <\\/s>']. Is there some way to make split interpret the escapes?
Not really, no. Your best bet would probably be to use re.split() instead, with a regex that uses a lookbehind to make sure a forward slash isn’t escaped, e.g.
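A minimal sketch of that idea, not a definitive implementation. The names `UNESCAPED_SLASH` and `parse_rule` are hypothetical, and the code assumes each rule sits on one line in the form /pattern/replace/ with slashes escaped as `\/`. Unescaping `\/` back to `/` afterwards is also an assumption about what the file intends.

```python
import re

# Hypothetical helper name. Matches a "/" only when the character
# immediately before it is not a backslash (negative lookbehind).
UNESCAPED_SLASH = re.compile(r'(?<!\\)/')

def parse_rule(line):
    """Parse one '/pattern/replace/' line into (compiled_pattern, replacement)."""
    # Splitting '/a/b/' on the delimiters yields ['', 'a', 'b', ''];
    # the outer empty strings come from the leading and trailing slash.
    fields = UNESCAPED_SLASH.split(line.strip())
    if len(fields) != 4 or fields[0] or fields[3]:
        raise ValueError("not of the form /pattern/replace/: %r" % line)
    ptn, repl = fields[1], fields[2]
    # Assumption: an escaped delimiter should become a plain slash.
    ptn = ptn.replace('\\/', '/')
    repl = repl.replace('\\/', '/')
    return re.compile(ptn), repl
```

In the loop from the question, `parse_rule(line)` could stand in for the `line[1:].split('/')[:-1]` step. One caveat: a single-character lookbehind cannot tell an escaped slash from a real delimiter that follows an escaped backslash (`\\/`), so files that contain that sequence would need a small hand-written scanner instead.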