I have this regex to filter URLs, but it also accepts some invalid URLs:
$regexUrl = "((https?|ftp)\:\/\/)?"; // SCHEME
$regexUrl .= "([a-zA-Z0-9+!*(),;?&=\$_.-]+(\:[a-zA-Z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and Pass
$regexUrl .= "([a-zA-Z0-9-.]*)\.([a-zA-Z]{2,3})"; // Host or IP
$regexUrl .= "(\:[0-9]{2,5})?"; // Port
$regexUrl .= "(\/([a-zA-Z0-9+\$_-]\.?)+)*\/?"; // Path
$regexUrl .= "(\?[a-zA-Z+&\$_.-][a-zA-Z0-9;:@&%=+\/\$_.-]*)?"; // GET Query
$regexUrl .= "(#[a-zA-Z_.-][a-zA-Z0-9+\$_.-]*)?"; // Anchor
For instance, “http://…XYZ” is also matched by the above regex, but it is an invalid URL.
Any help would be appreciated.
In your Host or IP line, change `*` to `+` and remove the `.` from the first `[]`. The effect of this is to require (with `+`) some characters from the first `[]` and not permit a `.` among them, since the `.` is handled (and required) by the `\.` which follows the first group.
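A minimal sketch of that change, for comparison with the pattern as posted. The helper name `buildUrlRegex()`, the `~` delimiter, and the `^...$` anchors are assumptions added for illustration. The test string `http://...XYZ` is also an assumed reading of the example in the question, with three literal dots.

```php
<?php
// Hypothetical helper: assembles the pattern from the question.
// $fixHost switches between the original host line and the revised one.
function buildUrlRegex(bool $fixHost): string
{
    $regexUrl  = "((https?|ftp)\:\/\/)?"; // Scheme
    $regexUrl .= "([a-zA-Z0-9+!*(),;?&=\$_.-]+(\:[a-zA-Z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and pass
    $regexUrl .= $fixHost
        ? "([a-zA-Z0-9-]+)\.([a-zA-Z]{2,3})"   // Revised: + instead of *, no . in the class
        : "([a-zA-Z0-9-.]*)\.([a-zA-Z]{2,3})"; // Host line as originally posted
    $regexUrl .= "(\:[0-9]{2,5})?"; // Port
    $regexUrl .= "(\/([a-zA-Z0-9+\$_-]\.?)+)*\/?"; // Path
    $regexUrl .= "(\?[a-zA-Z+&\$_.-][a-zA-Z0-9;:@&%=+\/\$_.-]*)?"; // GET query
    $regexUrl .= "(#[a-zA-Z_.-][a-zA-Z0-9+\$_.-]*)?"; // Anchor

    // Assumption: match the whole string, using ~ as the delimiter.
    return '~^' . $regexUrl . '$~';
}

// Assumed test input: three literal dots before XYZ.
var_dump(preg_match(buildUrlRegex(false), 'http://...XYZ'));     // int(1): original accepts it
var_dump(preg_match(buildUrlRegex(true), 'http://...XYZ'));      // int(0): revised host rejects it
var_dump(preg_match(buildUrlRegex(true), 'http://example.com')); // int(1): ordinary host still matches
```

Under these assumptions the revised host group accepts only a single label before the TLD, so an anchored match rejects multi-label hosts such as `www.example.com`. Allowing them would need a repeated label group, which goes beyond the change described above.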