We allow some user-supplied REs for filtering email. Early on we ran into performance problems with REs that contained, for example, .* when matched against arbitrarily large emails. A simple fix was to apply s/\*/{0,1024}/ to the user-supplied RE. This is not a perfect solution, however, because it breaks the following pattern:
/[*]/
Rather than coming up with some convoluted recipe to account for every possible variation of user-supplied RE input, I'd like to simply limit Perl's interpretation of the * and + quantifiers to a maximum of 1024 repetitions.
Is there any way to do this?
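For reference, the failure mode described above can be reproduced with a short script. This is only a sketch: the helper name `naive_limit` is made up for illustration, and the `/g` flag is an assumption (the substitution quoted above is shown without it).

```perl
use strict;
use warnings;

# Naive rewrite: every '*' becomes '{0,1024}', regardless of context.
# naive_limit is a hypothetical helper name; /g is assumed here.
sub naive_limit {
    my ($pattern) = @_;
    (my $limited = $pattern) =~ s/\*/{0,1024}/g;
    return $limited;
}

print naive_limit('a.*b'), "\n";   # the intended effect: a.{0,1024}b
print naive_limit('[*]'),  "\n";   # the class is rewritten too: [{0,1024}]

# After the rewrite the class no longer contains '*' at all.
my $broken = naive_limit('[*]');
print "literal star still matches\n" if '*' =~ /$broken/;
```

The last line prints nothing, because `[{0,1024}]` is a character class made of the characters `{`, `0`, `,`, `1`, `2`, `4` and `}`.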
Update
Added a (?<!\\) before the quantifiers, because an escaped * or + should not be matched. The replacement will still fail if there is a \\* (match \ 0 or more times).
An improvement would be this:
See it here on Regexr
That means: match [*+], but only if there is no closing ] ahead and no [ until then. And no \ (the (?<!\\) part) is allowed before the square brackets.
(?! ... ) is a negative lookahead
(?<! ... ) is a negative lookbehind
See perlretut for details.
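A minimal Perl sketch of the idea described above might look like the following. It is a reconstruction from the description, not necessarily the exact pattern from the original post. The names `%bound` and `limit_quantifiers` are hypothetical, and mapping `+` to `{1,1024}` is an assumption (only the `*` case is spelled out in the question).

```perl
use strict;
use warnings;

# Bounded replacements for the unbounded quantifiers (the '+' mapping is assumed).
my %bound = ('*' => '{0,1024}', '+' => '{1,1024}');

sub limit_quantifiers {
    my ($pattern) = @_;
    (my $limited = $pattern) =~ s{
        (?<!\\)            # not preceded by a backslash
        ([*+])             # the quantifier to bound
        (?!                # and not inside a character class, meaning:
            [^\[\]]*       #   no run of non-bracket characters ahead
            (?<!\\)\]      #   that ends in an unescaped closing bracket
        )
    }{$bound{$1}}gx;
    return $limited;
}

print limit_quantifiers($_), "\n" for 'a.*b', '[*]', '\*+', 'a+[+]b*';
```

As far as I can tell this leaves `[*]` and the escaped star in `\*+` alone while bounding the others. It shares the weakness already mentioned: an escaped backslash followed by a quantifier (`\\*`) is skipped, and a stray unescaped `]` later in the pattern can also suppress a replacement.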
Update 2: include possessive quantifiers
See it here on Regexr
Seems to be working, but it's getting really complicated now!
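To illustrate why possessive quantifiers need separate handling, here is a hedged sketch that keeps a trailing possessive `+` attached to the bounded quantifier instead of treating it as a second quantifier. Again this is my own reconstruction under assumptions, not the original pattern: `%bound` and `limit_quantifiers_possessive` are hypothetical names, and it relies on Perl 5.10 or later accepting possessive forms such as `{0,1024}+`.

```perl
use strict;
use warnings;

# Bounded replacements for the unbounded quantifiers (the '+' mapping is assumed).
my %bound = ('*' => '{0,1024}', '+' => '{1,1024}');

sub limit_quantifiers_possessive {
    my ($pattern) = @_;
    (my $limited = $pattern) =~ s{
        (?<!\\)                     # not preceded by a backslash
        ([*+])                      # the quantifier to bound
        (\+?)                       # optional possessive marker, copied through
        (?! [^\[\]]* (?<!\\) \] )   # next bracket ahead is not an unescaped closing one
    }{$bound{$1}$2}gx;
    return $limited;
}

print limit_quantifiers_possessive($_), "\n" for 'a*+b', 'a++', '[*+]', 'a*?';
```

Without the `(\+?)` group, `a*+b` would have both characters rewritten and end up with two stacked quantifiers, which Perl rejects. Lazy forms such as `*?` need no special case, since the `?` simply stays behind the inserted bound.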