I need some help to tweak this regular expression:
$content = 'more <a href="http://www.test.com">test</a> test <a href="mailto:jeff@test.com">Jeff</a> this is a <a href="http://www.test.com">test</a>';
$content = preg_replace("~<a .*?href=[\'|\"]mailto:(.*?)[\'|\"].*?>.*?</a>~", "$1", $content);
This expression is meant to strip the HTML markup from a mailto link and return just the email address (jeff@test.com).
It works fine except in the example I gave above. Because an unlimited number of whitespace characters is allowed before the href in the pattern, when a website link comes before the mailto link, the regex looks all the way forward until it finds the mailto: in the following link and removes all the content in between.
Maybe a fix would be to limit it to two or three whitespace characters after the opening tag so that it does not look so far ahead, but I wonder if there is a better solution from people who know regex better than I do?
The problem is not that you allow any amount of whitespace; that would work. The problem is that you allow one space followed by any amount of ANY character with your
<a .*
If you fix this and really allow only whitespace there (\s+ in place of the space and the .*? that follows it), it seems to work.
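The exact pattern from this answer is not preserved here, so the following is only a sketch of what the fix might look like, assuming the only change is swapping the space and .*? after <a for \s+. The variable names $original and $fixed are made up for the comparison; the sample string is the one from the question.

```php
<?php
// Sample input from the question: a mailto link between two ordinary links.
$content = 'more <a href="http://www.test.com">test</a> test <a href="mailto:jeff@test.com">Jeff</a> this is a <a href="http://www.test.com">test</a>';

// Pattern from the question: "<a .*?" may run across the first link
// until it reaches the href="mailto: of the second one.
$original = preg_replace('~<a .*?href=[\'"]mailto:(.*?)[\'"].*?>.*?</a>~', '$1', $content);

// Assumed fix: only whitespace is allowed between "<a" and "href".
$fixed = preg_replace('~<a\s+href=[\'"]mailto:(.*?)[\'"].*?>.*?</a>~', '$1', $content);

echo $original, PHP_EOL;
// more jeff@test.com this is a <a href="http://www.test.com">test</a>

echo $fixed, PHP_EOL;
// more <a href="http://www.test.com">test</a> test jeff@test.com this is a <a href="http://www.test.com">test</a>
```

Note that this version only matches when href is the first attribute of the tag; a link such as <a class="x" href="mailto:..."> would be left untouched.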
See it here at Regexr
But you should probably have a closer look at alex's answer (+1 for the example), as that would be the cleaner solution.