From Perl’s documentation:
study takes extra time to study SCALAR ($_ if unspecified) in anticipation of doing
many pattern matches on the string before it is next modified. This may or may not save
time, depending on the nature and number of patterns you are searching and the distribution
of character frequencies in the string to be searched;
I’m trying to speed up some regular expression-driven parsing that I’m doing in Python, and I remembered this trick from Perl. I realize I’ll have to benchmark to determine if there is a speedup, but I can’t find an equivalent method in Python.
As far as I know there’s nothing like this built into Python. But according to the perldoc:
This doesn’t sound very sophisticated, and you could probably hack together something equivalent yourself.
esmre is kind of vaguely similar. And as @Frg noted, you’ll want to use re.compile if you’re reusing a single regex (to avoid re-parsing the regex itself over and over).
Or you could use suffix trees (here’s one implementation, or here’s a C extension with unicode support) or suffix arrays (implementation).
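To make the re.compile suggestion concrete, here is a minimal sketch that compiles a pattern once and benchmarks it against calling re.findall with the raw pattern string each time. The pattern, the sample text, and the function names are made up for illustration; substitute your own. Note that the re module keeps an internal cache of recently compiled patterns, so the gap may be smaller than you expect, which is one more reason to measure.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
KEY_VALUE = re.compile(r"(\w+)=(\d+)")
text = "alpha=1 beta=22 gamma=333 " * 1000


def parse_precompiled(s):
    """Use the pattern object compiled once at import time."""
    return [(key, int(value)) for key, value in KEY_VALUE.findall(s)]


def parse_inline(s):
    """Pass the pattern string on every call (re looks it up in its cache)."""
    return [(key, int(value)) for key, value in re.findall(r"(\w+)=(\d+)", s)]


if __name__ == "__main__":
    # Timings vary by machine and Python version, so none are shown here.
    for fn in (parse_precompiled, parse_inline):
        elapsed = timeit.timeit(lambda: fn(text), number=100)
        print(f"{fn.__name__}: {elapsed:.3f}s for 100 runs")
```

Both functions return the same result; only the lookup cost differs. If the regex itself dominates the run time, precompiling alone won’t help much, and an index-based approach such as the suffix structures mentioned above might be worth benchmarking too.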