I came across this excellent “little” RegEx to replace URLs (not hyper-links) in plain text.
Only problem is I know very little RegEx so I’m completly stuck in getting this working for my blog.
So, I’m asking for help with excluding a URL e.g., $exception_url = 'http://mysite.com'
function strip_urls($text, $xception_url = FALSE)
{
return preg_replace("/( (?:
(?:https?|ftp) : \\/*
(?:
(?: (?: [a-zA-Z0-9-]{2,} \\. )+
(?: arpa | com | org | net | edu | gov | mil | int | [a-z]{2}
| aero | biz | coop | info | museum | name | pro
| example | invalid | localhost | test | local | onion | swift ) )
| (?: [0-9]{1,3} \\. [0-9]{1,3} \\. [0-9]{1,3} \\. [0-9]{1,3} )
| (?: [0-9A-Fa-f:]+ : [0-9A-Fa-f]{1,4} )
)
(?: : [0-9]+ )?
(?! [a-zA-Z0-9.:-] )
(?:
\\/
[^&?#\\(\\)\\[\\]\\{\\}<>\\'\\\"\\x00-\\x20\\x7F-\\xFF]*
)?
(?:
[?#]
[^\\(\\)\\[\\]\\{\\}<>\\'\\\"\\x00-\\x20\\x7F-\\xFF]+
)?
) | (?:
(?:
(?: (?: [a-zA-Z0-9-]{2,} \\. )+
(?: arpa | com | org | net | edu | gov | mil | int | [a-z]{2}
| aero | biz | coop | info | museum | name | pro
| example | invalid | localhost | test | local | onion | swift ) )
| (?: [0-9]{1,3} \\. [0-9]{1,3} \\. [0-9]{1,3} \\. [0-9]{1,3} )
)
(?: : [0-9]+ )?
(?! [a-zA-Z0-9.:-] )
(?:
\\/
[^&?#\\(\\)\\[\\]\\{\\}<>\\'\\\"\\x00-\\x20\\x7F-\\xFF]*
)?
(?:
[?#]
[^\\(\\)\\[\\]\\{\\}<>\\'\\\"\\x00-\\x20\\x7F-\\xFF]+
)?
) | (?:
[a-zA-Z0-9._-]{2,} @
(?:
(?: (?: [a-zA-Z0-9-]{2,} \\. )+
(?: arpa | com | org | net | edu | gov | mil | int | [a-z]{2}
| aero | biz | coop | info | museum | name | pro
| example | invalid | localhost | test | local | onion | swift ) )
| (?: [0-9]{1,3} \\. [0-9]{1,3} \\. [0-9]{1,3} \\. [0-9]{1,3} )
)
) )/Dx", '', $text);
}
Would be very grateful for an answer, thank you.
Changing the regex would be nearly impossible and would end up being huge.
You can, however, temporarily replace the portions of the exception URL which identify it as a URL with some bogus strings and then replace them back after the regex (and if you REALLY want to be paranoid, you can make sure the replacement strings don’t already exist in the text (or wouldn’t exist after URL stripping), and if they do, attach a random number until they don’t):