I’m using Simple HTML DOM to extract data from a HTML document, and I

Asked: June 14, 2026 (2026-06-14T13:08:11+00:00)

I’m using Simple HTML DOM to extract data from an HTML document, and I have a couple of issues that I need some help with.

On the line that begins with if ($td->find('a')), I want to extract the href and the content of the anchor node and place them in separate variables. However, the code doesn’t work (see the output noted in the comments next to the echo statements below).

What is the best way to do this? Note that my purpose is to create an XML document out of the information later on, so I need the information in the correct order.

The links lead to pages containing detailed information about the different cars (e.g. “Max speed”, “Price”, etc.) that I also want to extract and put into separate variables. How can I get hold of the data on these pages?

<?php
include 'simple_html_dom.php';

$html = new simple_html_dom();
$html = file_get_html('http://www.example.com/foo.html');

$items = array();

foreach ($html->find('table') as $table) {
    foreach ($table->find('tr') as $tr) {

        foreach ($tr->find('td') as $td) {

            if ($td->find('a')) {
                $link = $td->find('a.href');
                echo $link;  // empty

                $text = $td->find('a.text');
                echo $text; // Array
            }
            else {
                echo 'Name: ' . $td;
            }
        }
    }
}

The HTML document looks like this:

<div>
    <table>
        <tr>
            <td>
                <a href="car1.html" target="_blank">Car 1</a>
            </td>
            <td>
                Porsche
            </td>
        </tr>
        <tr>
            <td>
                <a href="car2.html" target="_blank">Car 2</a>
            </td>
            <td>
                Chrysler
            </td>
        </tr>
        ... and so on...

1 Answer

Editorial Team · Answer 1 · 2026-06-14T13:08:12+00:00

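Use $td->find('a', 0)->href to read the anchor’s href attribute and $td->find('a', 0)->innertext to read its contents. In the original code, 'a.href' and 'a.text' are treated as selectors for anchors with the class href or text, not as attribute accessors, so they match nothing useful. The second argument, 0, makes find() return the first matching element instead of an array of all matches, which also acts as a safeguard when a cell contains more than one anchor.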
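As a rough illustration of the above, here is a minimal sketch of the loop from the question rewritten with indexed find() calls. It assumes simple_html_dom.php is available as in the question, parses an inline copy of the sample HTML with str_get_html() so it runs without network access, and collects each row into a hypothetical $items record with the keys link, text and name (those key names are not from the original post).

```php
<?php
// Assumes simple_html_dom.php sits next to this script, as in the question.
include 'simple_html_dom.php';

// Inline copy of the sample markup from the question (two rows only).
$source = <<<HTML
<div>
    <table>
        <tr>
            <td><a href="car1.html" target="_blank">Car 1</a></td>
            <td>Porsche</td>
        </tr>
        <tr>
            <td><a href="car2.html" target="_blank">Car 2</a></td>
            <td>Chrysler</td>
        </tr>
    </table>
</div>
HTML;

$html = str_get_html($source);

$items = array();

foreach ($html->find('table tr') as $tr) {
    // Hypothetical record layout; one entry per row preserves document order.
    $item = array('link' => null, 'text' => null, 'name' => null);

    foreach ($tr->find('td') as $td) {
        // With index 0, find() returns the first matching node, or null if none.
        $a = $td->find('a', 0);

        if ($a) {
            $item['link'] = $a->href;             // value of the href attribute
            $item['text'] = trim($a->innertext);  // contents of the anchor
        } else {
            $item['name'] = trim($td->plaintext); // cell text without markup
        }
    }

    $items[] = $item;
}

print_r($items);
```

Because each row becomes one array entry, the order needed for the later XML output is kept. The link values are relative, so fetching the detail pages would presumably mean resolving each one against the URL of the listing page and passing the result to file_get_html(); which selectors to use for “Max speed” or “Price” depends on markup the question does not show.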
