I’m trying to parse some parts of an HTML page but I have problems

Question

Asked: June 4, 2026 (2026-06-04T00:36:49+00:00)

I’m trying to parse some parts of an HTML page but I have problems with my regular expression.
My code looks like this:

... Download page using wget and some other stuff ...

$PAGE_REGEXP = "\<div class="col bg_dark clear">";

#Array HTMLLines
@HTMLLines = split(/\n/, $Page);
foreach $ThisOne (@HTMLLines) {
    if ( ($Team) = ($ThisOne =~ /$PAGE_REGEXP/) ) {
        $T{TranslateTeams($Team)}++;
        $LastTeam=TranslateTeams($Team);
    };
};

This is the HTML page:

<div class="col bg_dark clear">
    <div class="col_1 left">15:30</div>
    <div class="col_3_archive left">Team A - Team B</div>
    <div class="col_2_archive left">
            1:4 (0:2)&nbsp;
    </div>

    <div class="col_5 left ">2.4&nbsp;</div>
    <div class="col_5 left ">3.6&nbsp;</div>
    <div class="col_5 left bold">2.9&nbsp;</div>
    <div class="col_8 left">
</div>

<div class="col  clear">
    <div class="col_1 left">15:30</div>
    <div class="col_3_archive left">Team C - Team D</div>
    <div class="col_2_archive left">
            2:3 (1:1)&nbsp;
    </div>

    <div class="col_5 left ">2.7&nbsp;</div>
    <div class="col_5 left ">3.7&nbsp;</div>
    <div class="col_5 left bold">2.5&nbsp;</div>
    <div class="col_8 left">
</div>

The information I need to parse is the team names, the final and halftime results, and the numbers in the col_5 divs, e.g. 2.4, 3.6 and 2.9 (for the game Team A - Team B).

When I run my script, Perl gives me the following error:
Bareword found where operator expected at parser.pl line 11, near “”\

I’m not familiar with all the existing Perl modules, so maybe I’m trying to do something that is quite easy to achieve with
the right module. Can anybody please give me some hints/tips on how to parse this HTML page?

Thx

1 Answer

Editorial Team · Answer 1 · 2026-06-04T00:36:52+00:00

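The double quotes inside the pattern end the double-quoted string right after class=, so Perl sees col as a bareword and reports the error. Put the pattern in single quotes instead; the line with the regexp should probably look something like this: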

$PAGE_REGEXP = '<div class="col bg_dark clear">';
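Fixing the quoting only removes the syntax error. The pattern has no capture group, so in list context a successful match returns 1 and $Team would end up as 1 rather than a team name. The match is also done line by line, while each game spans several lines. What follows is a minimal sketch of one way to pull out the requested fields with core Perl only. It assumes the page always looks like the sample in the question and that team names never contain " - ". The names parse_games and $ROW_START are made up for illustration. For real-world pages, a dedicated HTML parser from CPAN such as HTML::TreeBuilder is generally more robust than regular expressions.

```perl
use strict;
use warnings;

# Sample input copied from the question; in the real script this would be
# the downloaded page held in $Page.
my $Page = <<'HTML';
<div class="col bg_dark clear">
    <div class="col_1 left">15:30</div>
    <div class="col_3_archive left">Team A - Team B</div>
    <div class="col_2_archive left">
            1:4 (0:2)&nbsp;
    </div>

    <div class="col_5 left ">2.4&nbsp;</div>
    <div class="col_5 left ">3.6&nbsp;</div>
    <div class="col_5 left bold">2.9&nbsp;</div>
    <div class="col_8 left">
</div>

<div class="col  clear">
    <div class="col_1 left">15:30</div>
    <div class="col_3_archive left">Team C - Team D</div>
    <div class="col_2_archive left">
            2:3 (1:1)&nbsp;
    </div>

    <div class="col_5 left ">2.7&nbsp;</div>
    <div class="col_5 left ">3.7&nbsp;</div>
    <div class="col_5 left bold">2.5&nbsp;</div>
    <div class="col_8 left">
</div>
HTML

# qr{} with brace delimiters avoids having to escape quotes or slashes.
# A game row starts at <div class="col ... clear"> (hypothetical name).
my $ROW_START = qr{<div class="col\s[^"]*clear">};

# Hypothetical helper: returns one hash reference per game row.
sub parse_games {
    my ($html) = @_;
    my @games;

    # One chunk per row; the first chunk is whatever precedes the first row.
    my @chunks = split /$ROW_START/, $html;
    shift @chunks;

    for my $chunk (@chunks) {
        # Capture groups return the matched text in list context.
        my ($home, $away) =
            $chunk =~ m{class="col_3_archive left">\s*(.+?) - (.+?)\s*</div>}s
            or next;    # skip chunks without a team line
        my ($final, $halftime) =
            $chunk =~ m{class="col_2_archive left">\s*(\d+:\d+)\s*\((\d+:\d+)\)}s;
        # With /g in list context, every col_5 number is collected.
        my @numbers = $chunk =~ m{class="col_5 left[^"]*">\s*([\d.]+)}g;

        push @games, {
            home     => $home,
            away     => $away,
            final    => $final,
            halftime => $halftime,
            numbers  => \@numbers,
        };
    }
    return @games;
}

for my $game (parse_games($Page)) {
    printf "%s - %s: %s (%s) %s\n",
        $game->{home}, $game->{away},
        $game->{final}, $game->{halftime},
        join(', ', @{ $game->{numbers} });
}
```

In the original script, the loop body could then call TranslateTeams on $game->{home} and $game->{away} instead of on $Team.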
