From man perlre:
The “*” quantifier is equivalent to “{0,}”, the “+” quantifier to “{1,}”, and the “?” quantifier to “{0,1}”. n and m are limited to integral values less than a preset limit defined when perl is built. This is usually 32766 on the most common platforms. The actual limit can be seen in the error message generated by code such as this:
$_ **= $_ , / {$_} / for 2 .. 42;
Ay that’s ugly – Isn’t there some constant I can get instead?
Edit: As daxim pointed out (and perlretut hints towards) it might be that 32767 is a magical hardcoded number. A little searching in the Perl code goes a long way, but I’m not sure how to get to the next step and actually find out where the default reg_infty or REG_INFTY is actually set:
~/dev/perl-5.12.2
$ grep -ri 'reg_infty.*=' *
regexec.c: if (max != REG_INFTY && ST.count == max)
t/re/pat.t: $::reg_infty = $Config {reg_infty} // 32767;
t/re/pat.t: $::reg_infty_m = $::reg_infty - 1;
t/re/pat.t: $::reg_infty_p = $::reg_infty + 1;
t/re/pat.t: $::reg_infty_m = $::reg_infty_m; # Surpress warning.
Edit 2: DVK is of course right: It’s defined at compile time, and can probably be overridden only with REG_INFTY.
Summary: there are 3 ways I can think of to find the limit: empirical, “matching Perl tests” and “theoretical”.
Empirical:
This seems obvious enough that it doesn’t require explanation.
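For completeness, here is a minimal sketch of the empirical approach. It assumes the regex compiler's error message still contains the text "bigger than N", as in the regcomp.c line quoted further down. The helper name quantifier_limit is made up for illustration and is not part of Perl.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl): try growing quantifier values until
# the regex compiler rejects one, then read the limit out of the error text.
sub quantifier_limit {
    for my $exp (1 .. 31) {
        my $n       = 1 << $exp;            # 2, 4, 8, ... as plain integers
        my $pattern = 'a{' . $n . '}';      # e.g. a{32768}
        my $re      = eval { qr/$pattern/ };
        next if defined $re;                # compiled fine, try a bigger value

        # Assumption: the message reads "Quantifier in {,} bigger than N ..."
        return $1 if $@ =~ /bigger than (\d+)/;
        return;                             # failed for some other reason
    }
    return;                                 # no limit hit in the probed range
}

my $limit = quantifier_limit();
printf "largest allowed quantifier: %s\n", defined $limit ? $limit : 'unknown';
```

This is the same idea as the perlre one-liner, just with the number extracted instead of left in a fatal error. If the wording of the message differs on your perl, the helper returns nothing rather than guessing.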
Matches Perl tests:
Perl has a series of tests for regex, some of which (in pat.t) deal with testing this max value. So, you can assume that the max value computed in those tests is “good enough” and follow the test’s logic. The explanation of where in the tests this comes from is in the details below.
Theoretical: This attempts to replicate the EXACT logic used by the C code to generate this value.
This is harder than it sounds, because it’s affected by 2 things: the Perl build configuration and a bunch of C #define statements with branching logic. I was able to delve fairly deeply into that logic, but was stalled on two problems:
First, the #ifdefs reference a bunch of tokens that are NOT actually defined anywhere in the Perl code that I can find, and I don’t know how to find out from within Perl what those defines’ values were.
Second, the ultimate default value (assuming I’m right and those #ifdefs always end up with the default) is
#define PERL_USHORT_MAX ((unsigned short)~(unsigned)0)
(the actual limit is obtained by removing 1 bit from the resulting all-ones number; details below). I’m also not sure how to access the number of bytes in a short from Perl for whichever implementation was used to build the perl executable.
So, even if the answer to both those questions can be found (which I’m not sure of), the resulting logic would most certainly be “uglier” and more complex than the straightforward “empirical eval-based” one I offered as the first option.
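For what it’s worth, here is a rough sketch of what the default branch would look like from the Perl side. It assumes that the #ifdef chain really does fall through to the defaults, that $Config{shortsize} reflects sizeof(short) for the perl being run, and that a short is narrower than Perl’s native integer. It ignores any other branching in the headers, so treat it as an illustration of the arithmetic rather than the real computation.

```perl
use strict;
use warnings;
use Config;

# Assumption: the C defaults apply, so PERL_USHORT_MAX is an all-ones
# unsigned short and PERL_SHORT_MAX is that value with one bit removed.
my $short_bits = 8 * $Config{shortsize};    # sizeof(short) in bits for this build
my $ushort_max = (1 << $short_bits) - 1;    # like (unsigned short)~(unsigned)0
my $short_max  = $ushort_max >> 1;          # like (short)(PERL_USHORT_MAX >> 1)

print "PERL_USHORT_MAX ~ $ushort_max\n";
print "PERL_SHORT_MAX  ~ $short_max\n";
```

The size of a native short can also be cross-checked with length pack('S!', 0), which should agree with $Config{shortsize}.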
Below I will provide the details of where the various bits and pieces of logic related to this limit live in the Perl code, as well as my attempts to arrive at a “theoretically correct” solution matching the C logic.
OK, here is a partial investigation. You can complete it yourself, as I have to run, or I will complete it later:
From regcomp.c:
vFAIL2("Quantifier in {,} bigger than %d", REG_INFTY - 1);
So, the limit is obviously taken from the REG_INFTY define, which is declared in regcomp.h:
Please note that SHORTSIZE is overridable via Config. I will leave the details of that out, but the logic will need to include $Config{shortsize} 🙂
From handy.h (this doesn’t seem to be part of the Perl source at first glance, so it looks like an iffy step):
I could not find ANY place which defined INT16_MAX at all 🙁
Someone help please!!!
PERL_SHORT_MAX is defined in perl.h:
I wasn’t able to find any place which defined SHORT_MAX, MAXSHORT or SHRT_MAX so far. So the default of
((short) (PERL_USHORT_MAX >> 1))
is assumed for now 🙂
PERL_USHORT_MAX is defined very similarly in perl.h, and again I couldn’t find a trace of a definition of USHORT_MAX/MAXUSHORT/USHRT_MAX.
Which seems to imply that it’s set by default to:
#define PERL_USHORT_MAX ((unsigned short)~(unsigned)0)
How to extract that value from the Perl side, I have no clue. It’s basically the number you get by bitwise negating an unsigned 0 and truncating it to an unsigned short, so if unsigned short is 16 bits, then PERL_USHORT_MAX will be 16 ones and PERL_SHORT_MAX will be 15 ones, i.e. 2^15-1, i.e. 32767.
Also, from t/re/pat.t (regex tests):
$::reg_infty = $Config {reg_infty} // 32767;
(to illustrate where the non-default compiled-in value is stored).
So, to get your constant, you do:
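Following the test file’s logic, a minimal sketch might look like this. It assumes that the 32767 fallback from the perl-5.12.2 copy of t/re/pat.t applies to your perl, and that reg_infty only shows up in %Config when the build recorded a non-default value. The variable names are mine.

```perl
use strict;
use warnings;
use Config;

# Mirrors t/re/pat.t from perl-5.12.2: use $Config{reg_infty} when the build
# recorded one, otherwise fall back to the 32767 hardcoded in the test file.
my $reg_infty = $Config{reg_infty} // 32767;

# regcomp.c reports "bigger than REG_INFTY - 1", so REG_INFTY - 1 is taken
# here as the largest quantifier value the regex compiler accepts.
my $max_quantifier = $reg_infty - 1;

print "REG_INFTY (assumed): $reg_infty\n";
print "max quantifier:      $max_quantifier\n";
```

Keep in mind that the fallback reflects what the 5.12.2 tests assume, not necessarily what the running interpreter was compiled with, so the empirical check remains the more trustworthy of the two.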