I wrote a simple Perl script using the WWW::Selenium module that talks to the Selenium RC server, opens a web page, and downloads its source. This works for HTML pages. However, I also have an XML page whose source I want to download, and that does not seem to be possible with the ‘get_html_source()’ function. Below is a script showing what I want to do:
#!/usr/bin/perl -sw
use strict;
use WWW::Selenium;

print "\n setting up Selenium...\n";
# Connect to a Selenium RC server running locally on the default port.
my $sel = WWW::Selenium->new(
    host        => "localhost",
    port        => 4444,
    browser     => "*firefox",
    browser_url => "http://www.google.com",
);

print " starting Selenium...\n";
$sel->start;
$sel->open('someXMLpage...');
$sel->wait_for_page_to_load();

# This is the call that fails when the page is XML rather than HTML.
my $xml = $sel->get_html_source();
print $xml;
As you can see, get_html_source() is the problem: it returns an error saying that the page is not HTML. Is there some way to download whatever page is currently shown in the browser, regardless of its type (like clicking ‘View Source’ in Firefox or, even better, some get_source() function)? Note also that the URL I need does not end in something like ‘.xml’; the page is generated on the fly, if that matters.
Any wisdom greatly appreciated!
You want the Selenium RC get_page_source() function. It works even if the “page” isn’t HTML (even plain text, not just XML).
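A minimal sketch of how the script could pick the source method at run time. It assumes nothing about which methods your installed driver actually provides: it asks the object with can() and prefers get_page_source() (the name given above, not verified here against WWW::Selenium), falling back to get_html_source(). The fetch_source helper and the FakeDriver package are hypothetical names made up for this illustration; FakeDriver only stands in for a real driver so the sketch runs without a Selenium server.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the current page's source from a driver object.
# Tries get_page_source() first (assumed name, taken from the answer),
# then falls back to get_html_source().
sub fetch_source {
    my ($driver) = @_;
    for my $method (qw(get_page_source get_html_source)) {
        return $driver->$method() if $driver->can($method);
    }
    die "driver offers neither get_page_source nor get_html_source\n";
}

# Hypothetical stand-in for a real driver object.
{
    package FakeDriver;
    sub new             { return bless {}, shift }
    sub get_page_source { return '<?xml version="1.0"?><root/>' }
}

print fetch_source(FakeDriver->new), "\n";
```

In the original script you would pass $sel instead of FakeDriver->new, after the open() call.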