I'm using pQuery to get all TD cells from a table and check whether each one contains a valid URL.
pQuery is working fine and gives me the content of all TD cells.
But my Regexp::Common check, which I took from Stack Overflow, doesn't work.
Here's my code:
use Regexp::Common qw/URI/;
use pQuery;

pQuery( $url)
    ->find( "table")
    ->find( "tr")
    ->find( "td")
    ->each( sub {
        my $domain = pQuery( $_)->text;
        if( $domain =~ /$RE{URI}{HTTP}/) {
            print "OK\n";
        }
    });
The variable $domain contains the content of a TD cell; some of the cells have domains in them.
They all look like “hello-world.com” or “www.test.net”.
The text “OK” never gets printed. What's wrong here?
Is it because the domains are in the format above, with no http:// and not always a www? I want a simple check of whether the text is a valid URL.
Your content is not an HTTP URI, so $RE{URI}{HTTP} will not match. It looks like you're trying to match domain names. For that, load the net patterns with "use Regexp::Common qw/net/;" and match against $RE{net}{domain}{-nospace}.
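A minimal sketch of that check follows. The helper name looks_like_domain and the sample strings are only illustrative, not part of your code. It assumes two things you may want to change: the pattern is anchored so the whole cell text has to be a domain, and at least one dot is required, because the domain pattern on its own also accepts a single label such as "hello".

```perl
use strict;
use warnings;
use Regexp::Common qw/net/;

# Stringify the pattern object first, then anchor it so the entire
# string must be a domain name rather than merely contain one.
my $domain_pattern = $RE{net}{domain}{-nospace};
my $domain_re      = qr/\A$domain_pattern\z/;

# Hypothetical helper: returns 1 if $text looks like a dotted domain name.
sub looks_like_domain {
    my ($text) = @_;
    return 0 unless defined $text;

    # Cell text often carries surrounding whitespace; trim it first.
    $text =~ s/^\s+|\s+$//g;

    # Assumption: require at least one dot, since a bare label
    # such as "hello" also satisfies the domain pattern.
    return ( $text =~ $domain_re && $text =~ /\./ ) ? 1 : 0;
}

for my $cell ( 'hello-world.com', 'www.test.net', 'not a domain' ) {
    printf "%-16s %s\n", $cell, looks_like_domain($cell) ? 'OK' : 'no match';
}
```

Inside your each callback you would call looks_like_domain( pQuery( $_)->text ) in place of the $RE{URI}{HTTP} test. Note that this only checks the syntax of the name, not whether the domain actually exists.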