Hey all, I’ve successfully created a website scraper getting the top 40 from the

Question

Asked: May 22, 2026

Hey all,
I’ve successfully created a website scraper that gets the top 40 from the record industry website. However, one of the columns in the table I’m scraping is not always present. What I need is a way to remove any instances of this cell from my scrape:

<td><img src="/images/bullet_red.gif" width="8" height="8" title="Red Dot" /></td>

Here’s what I’ve got from a tutorial so far.

$url = "http://www.ariacharts.com.au/pages/charts_display_singles.asp?chart=1U50";
$raw = file_get_contents($url);
$newlines = array("\t","\n","\r","\x20\x20","\0","\x0B");

$content = str_replace($newlines, "", html_entity_decode($raw));

$start = strpos($content,'<table class="chartTable"');
$end = strpos($content,'</table>',$start) + 8;

$table = substr($content,$start,$end-$start);

preg_match_all("|<tr(.*)</tr>|U",$table,$rows);

foreach ($rows[0] as $row){
    if ((strpos($row,'<th')===false)){
        preg_match_all("|<td(.*)</td>|U",$row,$cells);

        $number = strip_tags($cells[0][1]);
        $name = strip_tags($cells[0][5]);
        $artist = strip_tags($cells[0][6]);

        $name = strtolower($name);
        $name = ucwords($name);

        echo "{$artist} - {$name} - Number {$number} <br>\n";
    }
}

1 Answer

Editorial Team · Answer 1 · 2026-05-22T20:45:31+00:00

Try using PHP Simple HTML DOM Parser instead of complex regular expressions: http://simplehtmldom.sourceforge.net/

require_once 'simple_html_dom.php';

$html = file_get_html('http://www.ariacharts.com.au/pages/charts_display_singles.asp?chart=1U50');
$table = $html->find('table.chartTable');

foreach ($table[0]->find('tr') as $row) {
    $columns = $row->find('td');
    if (count($columns) < 8) continue; // indexes up to 7 are read below

    $number = $columns[1]->plaintext;
    $name = ucwords($columns[6]->plaintext);
    $artist = $columns[7]->plaintext;

    echo "$artist - $name - Number $number <br />\n";
}
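
If you would rather drop the optional cell itself, so the remaining indexes stay the same whether or not it is present, something along these lines might work. This is only a sketch: it uses PHP's built-in DOMDocument and DOMXPath rather than Simple HTML DOM, the sample markup and the three-column layout are made up for illustration (the real chart table has more columns), and extractChartRows is a hypothetical helper name.

```php
<?php
// Hypothetical sample rows; the real chart markup will differ.
$sample = '<table class="chartTable">'
    . '<tr><th>Pos</th><th>Title</th><th>Artist</th></tr>'
    . '<tr><td>1</td>'
    . '<td><img src="/images/bullet_red.gif" width="8" height="8" title="Red Dot" /></td>'
    . '<td>SONG ONE</td><td>Artist A</td></tr>'
    . '<tr><td>2</td><td>SONG TWO</td><td>Artist B</td></tr>'
    . '</table>';

// Hypothetical helper: returns one array of cell texts per data row,
// with any "Red Dot" cell removed first.
function extractChartRows(string $html): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);

    // Take a snapshot of the matching cells, then remove them from the tree.
    $redDotCells = iterator_to_array($xpath->query('//td[img[@title="Red Dot"]]'));
    foreach ($redDotCells as $cell) {
        $cell->parentNode->removeChild($cell);
    }

    $rows = [];
    // tr[td] skips header rows that only contain <th> cells.
    foreach ($xpath->query('//table[@class="chartTable"]//tr[td]') as $tr) {
        $cells = [];
        foreach ($xpath->query('td', $tr) as $td) {
            $cells[] = trim($td->textContent);
        }
        $rows[] = $cells;
    }
    return $rows;
}

foreach (extractChartRows($sample) as [$number, $name, $artist]) {
    $name = ucwords(strtolower($name));
    echo "{$artist} - {$name} - Number {$number} <br>\n";
}
```

Against the live page you would pass the result of file_get_contents($url) to the helper and adjust the destructured positions to the real column layout.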
