From Perl’s documentation : study takes extra time to study SCALAR ($_ if unspecified)

Question

Asked: 2026-05-30T19:49:04+00:00

study takes extra time to study SCALAR ($_ if unspecified) in anticipation of doing
many pattern matches on the string before it is next modified. This may or may not save
time, depending on the nature and number of patterns you are searching and the distribution
of character frequencies in the string to be searched;

I’m trying to speed up some regular expression-driven parsing that I’m doing in Python, and I remembered this trick from Perl. I realize I’ll have to benchmark to determine if there is a speedup, but I can’t find an equivalent method in Python.


1 Answer

Editorial Team · Answer 1 · 2026-05-30T19:49:05+00:00

As far as I know, there’s nothing like this built into Python. But according to the perldoc for study:

The way study works is this: a linked list of every character in the
string to be searched is made, so we know, for example, where all the
‘k’ characters are. From each search string, the rarest character is
selected, based on some static frequency tables constructed from some
C programs and English text. Only those places that contain this
“rarest” character are examined.

This doesn’t sound very sophisticated, and you could probably hack together something equivalent yourself.
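As a rough illustration of what such a hack might look like, here is a minimal sketch for literal search strings only. The names `study` and `find_all` are made up for this example, and it picks the "rarest" character by counting occurrences in the text itself rather than using static frequency tables as the perldoc describes, so treat it as an approximation of the idea and not as what Perl actually does.

```python
from collections import defaultdict


def study(text):
    """Map each character to the list of positions where it occurs in text."""
    positions = defaultdict(list)
    for i, ch in enumerate(text):
        positions[ch].append(i)
    return positions


def find_all(text, positions, needle):
    """Return the start offsets of every occurrence of the literal needle."""
    if not needle:
        return []
    # Pick the needle character that occurs least often in this text.
    # Assumption: Perl's study used static frequency tables instead.
    offset, rarest = min(
        enumerate(needle), key=lambda pair: len(positions.get(pair[1], ()))
    )
    hits = []
    # Only positions holding the rarest character are examined.
    for pos in positions.get(rarest, ()):
        start = pos - offset
        if start >= 0 and text.startswith(needle, start):
            hits.append(start)
    return hits
```

Build the index once with `study(text)` and reuse it for many `find_all` calls on the same string. Whether this beats plain `str.find` or `re` in practice is something you would have to benchmark, since those run in C and this loop runs in Python.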

esmre is vaguely similar. And as @Frg noted, you’ll want to use re.compile if you’re reusing a single regex, so that the regex itself isn’t re-parsed over and over.
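For the re.compile part, a small sketch of the usual pattern: compile once at module level, then reuse the compiled object inside the loop. The pattern and the `parse_pairs` helper are hypothetical and only there to show the shape.

```python
import re

# Compiled once and reused. Hypothetical pattern: key=value pairs
# where the value is an integer.
PAIR = re.compile(r"(\w+)=(\d+)")


def parse_pairs(lines):
    """Collect every key=value pair found in lines into a dict."""
    result = {}
    for line in lines:
        for key, value in PAIR.findall(line):
            result[key] = int(value)
    return result
```

Note that the re module also keeps an internal cache of recently compiled patterns, so the gain from compiling explicitly may be modest. It mostly saves the cache lookup and keeps the code tidy.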

Or you could use suffix trees (there are pure-Python implementations as well as a C extension with Unicode support) or suffix arrays.
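To give an idea of the suffix array route, here is a deliberately naive sketch. The function names are invented for this example, and the construction simply sorts all suffixes, which is fine for short strings but far too slow and memory-hungry for large inputs, where a dedicated library would be the better choice.

```python
def build_suffix_array(text):
    """Return the start offsets of all suffixes of text in sorted order.

    Naive construction: it sorts full suffix slices, so it is only
    suitable for small inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def _bound(text, sa, needle, upper):
    """Binary search over sa for the lower (or upper) bound of needle."""
    lo, hi = 0, len(sa)
    m = len(needle)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = text[sa[mid]:sa[mid] + m]
        if prefix < needle or (upper and prefix == needle):
            lo = mid + 1
        else:
            hi = mid
    return lo


def sa_find_all(text, sa, needle):
    """Return the sorted start offsets of every occurrence of needle."""
    first = _bound(text, sa, needle, upper=False)
    last = _bound(text, sa, needle, upper=True)
    return sorted(sa[first:last])
```

As with study, the cost is paid up front when building the array, and each later lookup is a binary search over it, so it only pays off if you run many searches against the same unmodified string.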
