RegexOptions.IgnoreCase is more expensive than I would have thought (e.g., should be barely measurable)
Assuming that this applies to PHP, Python, Perl, Ruby, etc. as well as C# (which is what I assume Jeff was using), how much of a slowdown is it, and will I incur a similar penalty with /[a-zA-Z]/ as I will with /[a-z]/i?
Yes, [A-Za-z] will be much faster than setting RegexOptions.IgnoreCase, largely because of Unicode strings. But it’s also much more limiting: [A-Za-z] does not match accented international characters; it’s literally the ASCII set A-Za-z and nothing more.
I don’t know if you saw Tim Bray’s answer to my message, but it’s a good one:
http://www.tbray.org/ongoing/When/200x/2003/10/11/SearchI18n
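
To make the comparison above concrete, here is a rough sketch in Python (one of the languages the question asks about), not the C# code the original measurement used. It assumes Python 3's standard `re` module, whose behaviour differs from .NET in its details: there, `[a-z]` with `re.IGNORECASE` still rejects accented letters such as "é", but it does pick up a handful of non-ASCII characters (for instance the Kelvin sign, U+212A) through Unicode case mapping, so the two patterns are not interchangeable. The names `explicit`, `ignore_case`, `typo_range`, `matches` and `time_pattern` are made up for this illustration, and the timings will vary by machine and regex engine, so no expected numbers are given.

```python
import re
import timeit

# Explicit ASCII class: exactly the 52 letters A-Z and a-z.
explicit = re.compile(r"[A-Za-z]")

# Case-insensitive class: the engine applies Unicode case mapping.
ignore_case = re.compile(r"[a-z]", re.IGNORECASE)

# An easy slip: "A-z" spans code points 65..122, which also covers [ \ ] ^ _ `
typo_range = re.compile(r"[a-zA-z]")


def matches(pattern, ch):
    """Return True if the single character ch is matched by pattern."""
    return pattern.fullmatch(ch) is not None


def time_pattern(pattern, text, number=50):
    """Time `number` full scans of text with pattern (rough micro-benchmark)."""
    return timeit.timeit(lambda: pattern.findall(text), number=number)


if __name__ == "__main__":
    # Columns: character, explicit class, IGNORECASE class, typo range.
    for ch in ["q", "Q", "é", "\u212a", "_"]:
        print(repr(ch), matches(explicit, ch), matches(ignore_case, ch),
              matches(typo_range, ch))

    # Illustrative timing only; results depend on the engine and the input.
    text = "The quick brown fox jumps over the lazy dog. " * 200
    for name, pattern in [("[A-Za-z]", explicit), ("[a-z] + IGNORECASE", ignore_case)]:
        print(name, time_pattern(pattern, text))
```

If you only ever need ASCII letters, the explicit class states that intent directly; if you need real international text, neither pattern is enough on its own, which is the point of the link above.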