I am using a regex to replace all email addresses in a string with an <a> element to make them clickable. This works perfectly, except when the email address is preceded by two words of a certain minimum length joined by a dash. Only then do I get an empty string as the result.
<?php
$search = '#(^|[ \n\r\t])(([a-z0-9\-_]+(\.?))+@([a-z0-9\-]+(\.?))+[a-z]{2,5})#si';
$replace = '\\1<a href="mailto:\\2">\\2</a>';
$string = "tttteeee-sssstttt mail@test.nl";
echo preg_replace($search, $replace, $string);
// Output: "" (empty)
$string = "te-st mail@test.nl";
echo preg_replace($search, $replace, $string);
// Output: "te-st <a href="mailto:mail@test.nl">mail@test.nl</a>" (as expected)
$string = "mail@test.nl tttteeee-sssstttt";
echo preg_replace($search, $replace, $string);
// Output: "<a href="mailto:mail@test.nl">mail@test.nl</a> tttteeee-sssstttt" (as expected)
?>
I have tried everything, but I really can’t find the problem. A solution would be removing the first dash in the regex (before the @ sign), but that way email addresses with a dash before the @ wouldn’t be highlighted.
OK, minimal test case:
#([a-z-]+\.?)+@# reaches the backtrack limit (check with preg_last_error()). The engine has trouble deciding where to put things: because the \. is optional, working out whether a character belongs to the inner + or the outer + is a lot of work. The default pcre.backtrack_limit of 100000 is not enough; setting it to 1000000 works.
To solve this, make it easier on the engine: the first part,
(([a-z0-9\-_]+(\.?))+ should become: ([a-z0-9\-_]+(\.[a-z0-9\-_]+)*), which is a lot easier to match internally. And as a bonus, unlike the accepted answer, this still doesn’t allow consecutive dots.
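To make the difference visible, here is a minimal sketch that runs the question's test string through both patterns. The helper name linkify_emails() is made up for this example and is not part of the original code. It relies on the fact that preg_replace() returns null on failure, which echo prints as an empty string. I have kept the outer capturing group around the whole address so that \\2 in the replacement still refers to the full email. Whether the original pattern actually fails on your machine depends on your PHP version and pcre.* settings, so no output is claimed for it.

```php
<?php
// Hypothetical helper (not from the original post): wraps preg_replace()
// and returns null when PCRE gives up instead of hiding the failure.
function linkify_emails(string $pattern, string $text): ?string
{
    $replace = '\\1<a href="mailto:\\2">\\2</a>';
    return preg_replace($pattern, $replace, $text);
}

// Pattern from the question: nested quantifiers with an optional dot.
$originalPattern = '#(^|[ \n\r\t])(([a-z0-9\-_]+(\.?))+@([a-z0-9\-]+(\.?))+[a-z]{2,5})#si';

// Same pattern with the local part rewritten as suggested above.
$fixedPattern = '#(^|[ \n\r\t])([a-z0-9\-_]+(\.[a-z0-9\-_]+)*@([a-z0-9\-]+(\.?))+[a-z]{2,5})#si';

$string = "tttteeee-sssstttt mail@test.nl";

// Outcome depends on PHP version and pcre settings, so only report it.
$original = linkify_emails($originalPattern, $string);
if ($original === null) {
    $hitLimit = preg_last_error() === PREG_BACKTRACK_LIMIT_ERROR;
    echo 'original pattern failed', $hitLimit ? ' (backtrack limit)' : '', "\n";
}

// The rewritten pattern only backtracks linearly on this input.
$fixed = linkify_emails($fixedPattern, $string);
echo $fixed, "\n";
```

The null check is the part worth keeping in real code: without it, a PCRE failure is indistinguishable from an empty result.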