Editorial Team
Asked: May 17, 2026

I’m currently working on a Perl script to gather data from the QuakeLive website.

I’m currently working on a Perl script to gather data from the QuakeLive website.
Everything was going fine until I hit one set of data that I couldn't extract.

I have been using regexes, and they work for everything except the favourite arena, weapon and game type. I just need the name of each of those three elements captured in $1 for further processing.

I tried matching up to the favorites image, but without success. In case it helps, the script already uses WWW::Mechanize.

I suspect the problem is related to the class attribute on the paragraphs that hold those elements; the paragraph I matched earlier had no class.

You can find an example profile HERE.

Note that for the previous part of the page, it worked using code like:

$content =~ /<b>Wins:<\/b> (.*?)<br \/>/;
$wins = $1;
print "Wins: $wins\n";
1 Answer

    Editorial Team
    Added an answer on May 17, 2026 at 2:49 am

    The immediate problem is that you have:

    <p class="prf_faves">
    <img src="http://cdn.quakelive.com/web/2010092807/images/profile/none_v2010092807.0.gif" 
         width="17" height="17" alt="" class="fl fivepxhr" />
                    <b>Arena:</b> Campgrounds
                    <div class="cl"></div>
                </p>
    

    That is, there is no <br /> following the value for favorites such as Arena. Now, the correct way to do this would involve using a proper HTML parser. The fragile solution is to adapt your pattern (untested):

    my ($favarena) = $content =~ m{<b>Arena:</b> ([^<]+)};
    

    That should put everything up to the < of the next <div> in $favarena. Now, if all arenas are single words with no spaces in them,

    my ($favarena) = $content =~ m{<b>Arena:</b> (\S+)};
    

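    would save you the trouble of having to trim whitespace afterwards. Multi-word values such as "Clan Arena" or "Rocket Launcher" would be cut off at the first space, though.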
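
The same pattern can be reused for all three favorites. The following is only a sketch: the sample string stands in for the page source held in $content, and it assumes that the "Game Type" and "Weapon" paragraphs use the same markup as the Arena paragraph. The %fav hash is a hypothetical name chosen for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the page source held in $content.
# Only the Arena markup is taken from the profile page; the other two
# paragraphs are assumed to follow the same layout.
my $content = <<'HTML';
<p class="prf_faves">
    <b>Arena:</b> Campgrounds
    <div class="cl"></div>
</p>
<p class="prf_faves">
    <b>Game Type:</b> Clan Arena
    <div class="cl"></div>
</p>
<p class="prf_faves">
    <b>Weapon:</b> Rocket Launcher
    <div class="cl"></div>
</p>
HTML

my %fav;
for my $label ('Arena', 'Game Type', 'Weapon') {
    # Capture everything after the label up to the next tag.
    if ( $content =~ m{<b>\Q$label\E:</b>\s*([^<]+)} ) {
        my $value = $1;
        $value =~ s/\s+\z//;    # trim the trailing newline and indentation
        $fav{$label} = $value;
    }
}

print "$_: $fav{$_}\n" for sort keys %fav;
```

Checking the match inside an if avoids picking up a stale $1 from an earlier successful match when a label is missing from the page.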

    Note that such regex-based solutions are easily fooled by simple things like commented-out snippets in the source. E.g., if the source were to be changed to:

    <p class="prf_faves">
    <img src="http://cdn.quakelive.com/web/2010092807/images/profile/none_v2010092807.0.gif" 
         width="17" height="17" alt="" class="fl fivepxhr" />
    <!-- <b>Arena:</b> here -->
                    <b>Arena:</b> Campgrounds
                    <div class="cl"></div>
                </p>
    

    your script would be in trouble, whereas a solution using an HTML parser would not.
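
To make the failure concrete, here is a small sketch in which a commented-out label matches the pattern exactly, so the regex picks up the comment text instead of the arena. The sample string is a hypothetical stand-in for the page source. Stripping comments first, as shown in the second half, is a crude workaround and not a substitute for a parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical page source with a commented-out label ahead of the real one.
my $content = <<'HTML';
<p class="prf_faves">
<!-- <b>Arena:</b> here -->
    <b>Arena:</b> Campgrounds
    <div class="cl"></div>
</p>
HTML

# The pattern matches the first occurrence, which sits inside the comment.
my ($naive) = $content =~ m{<b>Arena:</b> ([^<]+)};
$naive =~ s/\s+\z//;

# Crude workaround: remove comments before matching.
(my $stripped = $content) =~ s/<!--.*?-->//gs;
my ($favarena) = $stripped =~ m{<b>Arena:</b> ([^<]+)};
$favarena =~ s/\s+\z//;

print "naive:    '$naive'\n";
print "stripped: '$favarena'\n";
```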

    An example using HTML::TokeParser::Simple:

    #!/usr/bin/perl
    
    use strict; use warnings;
    
    use HTML::TokeParser::Simple;
    
    my $p = HTML::TokeParser::Simple->new( 'martianbuddy.html' );
    
    while ( my $tag = $p->get_tag('p') ) {
        next unless $tag->is_start_tag;
        next unless defined (my $class = $tag->get_attr('class'));
        next unless grep { /^prf_faves\z/ } split ' ', $class;
    
        $p->get_tag('b');                  # skip ahead to the <b> label
        my $type  = $p->get_text('/b');    # label text, e.g. "Arena:"
        my $value = $p->get_text('/p');    # everything up to </p>
        $value =~ s/\s+\z//;               # drop trailing whitespace
    
        print "$type $value\n";
    }
    

    Output:

    Arena:  Campgrounds
    Game Type:  Clan Arena
    Weapon:  Rocket Launcher
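
Since the script already has the page in a string, the parser can also read from that string instead of a file. This is a sketch under assumptions: the sample string stands in for whatever WWW::Mechanize returned, and the %fav hash and the label cleanup are illustrative choices rather than part of the example above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::TokeParser::Simple;

# Hypothetical stand-in for the string fetched with WWW::Mechanize,
# e.g. my $content = $mech->content;
my $content = <<'HTML';
<p class="prf_faves">
    <b>Arena:</b> Campgrounds
    <div class="cl"></div>
</p>
<p>
    <b>Wins:</b> 10<br />
</p>
HTML

# Parse from the in-memory string rather than from a file on disk.
my $p = HTML::TokeParser::Simple->new( string => $content );

my %fav;
while ( my $tag = $p->get_tag('p') ) {
    # Only paragraphs carrying the prf_faves class are of interest.
    my $class = $tag->get_attr('class');
    next unless defined $class;
    next unless grep { $_ eq 'prf_faves' } split ' ', $class;

    $p->get_tag('b');
    my $label = $p->get_text('/b');
    my $value = $p->get_text('/p');

    # Normalise: drop the label's colon and any surrounding whitespace.
    $label =~ s/:\s*\z//;
    $value =~ s/^\s+|\s+\z//g;

    $fav{$label} = $value;
}

print "$_ => $fav{$_}\n" for sort keys %fav;
```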

    And, here is an example using HTML::TreeBuilder:

    #!/usr/bin/perl
    
    use strict; use warnings;
    
    use HTML::TreeBuilder;
    
    my $tree = HTML::TreeBuilder->new;
    $tree->parse_file('martianbuddy.html');
    
    my @p = $tree->look_down(_tag => 'p', sub {
            return unless defined (my $class = $_[0]->attr('class'));
            return unless grep { /^prf_faves\z/ } split ' ', $class;
            return 1;
        }
    );
    
    for my $p ( @p ) {
        my $text = $p->as_text;
        $text =~ s/^\s+//;
        my ($type, $value) = split ': ', $text;
        print "$type: $value\n";
    }
    

    Output:

    Arena: Campgrounds 
    Game Type: Clan Arena 
    Weapon: Rocket Launcher

    Because the page is an HTML fragment rather than a full document, modules based on HTML::Parser will serve you better than those that expect to operate on well-formed XML documents.
