I’ve got a simple JavaScript regex check (written by another developer) that works perfectly on thousands of different strings. However, I’ve just discovered one particular string value that causes the regex to take as long as 10 minutes to execute in Firefox/IE, which is unacceptable. I’ve extracted the actual regex call into a small code snippet for your convenience:
<html>
<script>
function dodo(){
var mask = /^([\w'#@\-\&\(\)\/.]+[ ]*){1,100}$/;
var value = "Optometrists Association Australia, Queensland/NT Division";
mask.exec(value);
}
</script>
<body>
<input type="button" value="Click" onclick="dodo()">
</body>
</html>
What is the problem here? If I change value to anything else it works perfectly.
Thank you!
This looks like a poor application for a regex, and a poor regex to boot. The intent seems to be to match a list of between 1 and 100 space-separated “words”, I think. Here are the core problems I can see:
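The use of “[ ]*” at the end of the word, instead of “[ ]+”, means that every character can potentially be a “word” on its own, whether it’s bounded by spaces or not. Your value contains a comma, which the character class doesn’t allow, so the match has to fail, and before giving up the engine tries every possible way of carving the text before the comma into “words”. That’s an enormous number of match cases for your engine to work through.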
You’re using capturing parentheses (“(…)”) instead of non-capturing ones (“(?:…)”). The group will be doing yet more bookkeeping to save the last word matched for you, which you probably don’t need.
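To make those two points concrete, here is a minimal sketch of one way the mask could be restructured. It keeps the original character set, but requires at least one space between words and uses a non-capturing group, so there is only one way to divide a string into words. The name safeMask is just an illustration, and I’m assuming the intent really is “1 to 100 space-separated words, trailing spaces allowed”.

```javascript
// Sketch only: same allowed characters as the original mask.
// First word, then up to 99 more words, each preceded by one or more spaces.
// Because a separator is mandatory, a failed match is rejected quickly
// instead of being retried with every possible word boundary.
var safeMask = /^[\w'#@\-&()\/.]+(?: +[\w'#@\-&()\/.]+){0,99} *$/;

var value = "Optometrists Association Australia, Queensland/NT Division";

// The comma is not in the character class, so this value does not match.
var matches = safeMask.test(value);
```

Note that the problem value still fails validation because of the comma; the difference is that it now fails straight away. If commas are meant to be legal, add one to the character class.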
And some minor issues:
The brackets in “[ ]*” are redundant. Just use “ *” to match zero or more spaces. But you probably want “\s” there, to match whitespace of any type, not just a space.
The expression allows whitespace at the end of the string, but not at the beginning. Most applications want to tolerate both or neither.
For readability, don’t use backslash escaping where it’s not needed. Only the “-” in your bracket actually needs it.
What’s magic about 100? Do you really want to hard-code that limit?
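As a rough illustration of the minor points, the mask could be built by a small helper instead of being hard-coded. This is only a sketch: buildMask and maxWords are made-up names, and I’m assuming you want to tolerate whitespace at both ends and require whitespace between words.

```javascript
// Sketch only: builds a mask for "1 to maxWords whitespace-separated words".
function buildMask(maxWords) {
  // Putting "-" last in the class means nothing needs a backslash
  // except \w itself (doubled here because this is a string, not a literal).
  var word = "[\\w'#@&()/.-]+";
  return new RegExp(
    "^\\s*" + word +                                   // optional leading whitespace, first word
    "(?:\\s+" + word + "){0," + (maxWords - 1) + "}" + // further words, each after whitespace
    "\\s*$"                                            // optional trailing whitespace
  );
}

var mask = buildMask(100);
var ok = mask.test("  Queensland/NT Division ");
```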
Finally, why use a regex here at all? Why not simply split() on whitespace into an array of substrings, and then test each resulting word against a simpler expression?
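Something along these lines is what I have in mind. Treat it as a sketch rather than a drop-in replacement: isValidName, WORD_MASK and maxWords are names I made up, and the allowed characters are copied from your original mask.

```javascript
// Sketch only: validate by splitting on whitespace and checking each word.
var WORD_MASK = /^[\w'#@&()\/.-]+$/;

function isValidName(value, maxWords) {
  // Strip leading/trailing whitespace (replace() is used instead of trim()
  // so this also runs in older browsers), then split on runs of whitespace.
  var words = value.replace(/^\s+|\s+$/g, "").split(/\s+/);
  if (words.length > maxWords) {
    return false;
  }
  for (var i = 0; i < words.length; i++) {
    // An empty string yields one empty "word", which fails here.
    if (!WORD_MASK.test(words[i])) {
      return false;
    }
  }
  return true;
}

var result = isValidName("Optometrists Association Australia, Queensland/NT Division", 100);
```

Each word is tested against a simple pattern with a single quantifier, so there is nothing for the engine to backtrack over, and you can easily report which word was rejected.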