I'm trying to match email addresses, but only when they are not preceded by “mailto:”. I tried this regular expression:
"/(?<!mailto:)[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/"
against this string:
'<a href="mailto:someemail@domain.com">EMAIL</a> ... otheremail@domain.com '
I expected to match only 'otheremail@domain.com', but I also get 'omeemail@domain.com' (note the missing 's'). What's wrong here? Can't I use a normal regex after the lookbehind assertion?
My whole example in PHP looks like:
$testString = '<a href="mailto:someemail@domain.com">EMAIL</a> ... otheremail@domain.com ';
$pattern = "/(?<!mailto:)[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/";
preg_match_all($pattern, $testString, $matches);
echo('<pre>');print_r($matches);echo('</pre>');
Thank you!
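Because after the s there is a string that matches your regex, omeemail@domain.com, and because s is hardly mailto:, it matches. The lookbehind is only checked at the position where a match attempt starts, so when it fails at the s the engine simply moves one character forward and tries again. Getting a word boundary in there will work for most cases. Change:
(?<!mailto:)[_a-z0-9-]+
To:
(?<!mailto:)\b[_a-z0-9-]+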
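To see the difference on the test string from the question, here is a minimal sketch. It assumes the word boundary goes directly after the lookbehind; the variable names ($tail, $original, $fixed, $before, $after) are made up for this illustration, and example.com is swapped in for the domain.

```php
<?php
// Test string from the question, with example.com as the domain.
$testString = '<a href="mailto:someemail@example.com">EMAIL</a> ... otheremail@example.com ';

// Shared remainder of the pattern, unchanged from the question.
$tail = '[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,4})/';

// Original: lookbehind only. Assumed fix: lookbehind followed by \b.
$original = '/(?<!mailto:)' . $tail;
$fixed    = '/(?<!mailto:)\b' . $tail;

preg_match_all($original, $testString, $before);
preg_match_all($fixed, $testString, $after);

print_r($before[0]); // omeemail@example.com and otheremail@example.com
print_r($after[0]);  // otheremail@example.com only
```

The "most cases" caveat matters: a hyphen is not a word character, so in something like mailto:some-email@example.com there is a word boundary before the hyphen and the pattern would still pick up -email@example.com.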
On a side note: use example.com for examples; domain.com is owned by an actual company.