I’m trying to write a Perl script to log into a password-protected site. I’ve used the WWW::Mechanize module for similar tasks in the past, but this site is different in a couple of ways:
- It uses JavaScript on the protected pages, so Mechanize won’t work. I’d prefer to implement something with a headless browser, as the script runs hourly on my work machine.
- It has no login form; instead, the browser displays a popup box to log in, and I can’t for the life of me figure out how to deal with it.
The URL: https://fwxwww2.hpr.for.gov.bc.ca/Scripts/Public/Common/Report.asp?Report=Hourly
I’ve found enough good resources for how to proceed with the JavaScript after logging in, so it’s really just the authentication that has me stumped. Thanks in advance for any suggestions on how to approach this. I’m open to solutions that don’t involve Perl, but I’m running Cygwin, so my options are somewhat limited.
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;
use Data::Dumper;
my $url= 'https://fwxwww2.hpr.for.gov.bc.ca/Scripts/Public/Common/Report.asp?Report=Hourly';
my $mech = WWW::Mechanize->new( autocheck => 1 );
$mech->credentials(
'myusername',
'mypassword'
);
$mech->get( $url );
print $mech->content();
This is HTTP authentication, described in RFC 2617 and documented in the credentials method in WWW::Mechanize and LWP::UserAgent. I see no JavaScript involved, except perhaps in the documents served after authentication. See my documentation improvement for JS-enabled Mech-workalikes.
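To check which authentication scheme the server is actually asking for, one option is to look at the WWW-Authenticate header of the 401 response. The following is only a minimal sketch: the helper name auth_schemes is hypothetical, and it takes just the first token of each header value, which is a simplification of the full challenge syntax.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Hypothetical helper: return the scheme name (the first token) of each
# WWW-Authenticate header value in a response, e.g. "Basic" or "NTLM".
sub auth_schemes {
    my ($response) = @_;

    # In list context, header() returns one value per header line.
    my @challenges = $response->header('WWW-Authenticate');
    return map { /^\s*(\S+)/ ? $1 : () } @challenges;
}

# Only touch the network when a URL is passed on the command line.
if (@ARGV) {
    my $ua       = LWP::UserAgent->new;
    my $response = $ua->get( $ARGV[0] );
    print $response->status_line, "\n";
    print 'Offered schemes: ', join( ', ', auth_schemes($response) ), "\n";
}
```

Run it with the protected URL as its only argument. Without credentials the request should come back as a 401, and the schemes listed tell you which kind of credentials call is needed.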
edit:
Antonio Dolcetta’s answer gives the hint that the NTLM authentication scheme is used. Upgrade your version of Authen::NTLM. As per the LWP::Authen::Ntlm documentation, enable keep-alive and use the correct notation for netloc (including port number) and username (including NT domain name).
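The points above can be sketched roughly as follows, using plain LWP::UserAgent. This is a sketch under assumptions, not a tested solution for this particular server: the helper names netloc_for and fetch_with_ntlm and the environment variables NTLM_USER and NTLM_PASS are made up for illustration, and it assumes Authen::NTLM and LWP::Protocol::https are installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use URI;

# Hypothetical helper: build the "host:port" netloc that credentials()
# expects. host_port() always includes the port, even the scheme default.
sub netloc_for {
    my ($url) = @_;
    return URI->new($url)->host_port;
}

# Hypothetical helper: fetch a page that sits behind NTLM authentication.
sub fetch_with_ntlm {
    my ( $url, $user, $password ) = @_;

    # NTLM authenticates the connection itself, so keep-alive must be on.
    my $ua = LWP::UserAgent->new( keep_alive => 1 );

    # The realm is left empty, as in the LWP::Authen::Ntlm documentation.
    # $user must include the NT domain, e.g. MYDOMAIN\myusername.
    $ua->credentials( netloc_for($url), '', $user, $password );

    my $response = $ua->get($url);
    die 'Request failed: ', $response->status_line, "\n"
        unless $response->is_success;
    return $response->decoded_content;
}

my $url = 'https://fwxwww2.hpr.for.gov.bc.ca/Scripts/Public/Common/Report.asp?Report=Hourly';

# Assumed environment variable names; nothing is fetched unless both are set.
if ( defined $ENV{NTLM_USER} && defined $ENV{NTLM_PASS} ) {
    print fetch_with_ntlm( $url, $ENV{NTLM_USER}, $ENV{NTLM_PASS} );
}
else {
    warn "Set NTLM_USER (as DOMAIN\\user) and NTLM_PASS to fetch $url\n";
}
```

Reading the credentials from the environment keeps them out of the script, which matters a little more for something that runs hourly from a scheduler. Since WWW::Mechanize is a subclass of LWP::UserAgent, the same keep_alive option and credentials arguments should carry over to the original script.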