<?php function walkDOM($node) { if (! isset($node->childNodes)) return; for ($i = 0; $i <

Question


Asked: June 2, 2026

<?php
// Recursively print the name, type and value of every node below $node.
function walkDOM($node)
{
    // Nodes without a child list have nothing to descend into.
    if (! isset($node->childNodes)) {
        return;
    }

    for ($i = 0; $i < $node->childNodes->length; $i++) {
        $childNode = $node->childNodes->item($i);

        echo $childNode->nodeName . " - " . $childNode->nodeType .
             " - \"" . $childNode->nodeValue . "\"\n";
        walkDOM($childNode);
    }
}

// Parse an HTML string, print the serialized document, then walk its tree.
function processHTML($s)
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $success = $doc->loadHTML($s);
    if (! $success) {
        echo "Load HTML failed.\n";
        exit(1);
    }
    echo "Loaded HTML: " . $doc->saveHTML() . "\n";
    walkDOM($doc);
}

$s = '<div>hello, <p>world<big>!</big></p></div>';
processHTML($s);
?>

Output:

Loaded HTML: <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><div>hello, <p>world<big>!</big></p></div></body></html>

html - 10 - ""
html - 1 - "hello, world!"
body - 1 - "hello, world!"
div - 1 - "hello, world!"
#text - 3 - "hello, "
p - 1 - "world!"
#text - 3 - "world"
big - 1 - "!"
#text - 3 - "!"

From the code and output above, we can see that reading the nodeValue property of an element node returns the text of everything inside it, with all tags stripped. I can use this to remove all tags as follows:

$s = '<div>hello, <p>world<big>!</big></p></div>';
$doc = new DOMDocument('1.0', 'UTF-8');
$doc->loadHTML($s);
echo $doc->childNodes->item(1)->nodeValue . "\n";

Output:

hello, world!

But I can do so using strip_tags as well:

$s = '<div>hello, <p>world<big>!</big></p></div>';
echo strip_tags($s) . "\n";

I have two questions:

1. Can I rely on this behavior of the nodeValue property in the future, whether to strip tags or for anything else I might use it for? Are there any hidden surprises?
2. How does using nodeValue to strip tags differ from using strip_tags()?

1 Answer

Editorial Team · Answer 1 · 2026-06-02T16:13:56+00:00

If you just want to strip tags, why complicate things? I think strip_tags() is the faster and better choice. It also accepts a second parameter listing the tags that should not be stripped.
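
As a rough sketch of the second parameter mentioned above, the snippet below reuses the sample string from the question and keeps only the <p> tags. It also shows one difference between the two approaches that is easy to overlook: strip_tags() leaves HTML entities as written, while text read back from the DOM has them decoded. The string in $t and all variable names are made up for illustration.

```php
<?php
// Same sample markup as in the question.
$s = '<div>hello, <p>world<big>!</big></p></div>';

// Strip every tag.
$plain = strip_tags($s);

// Strip every tag except <p>; the second argument lists the tags to keep.
$keepP = strip_tags($s, '<p>');

// Hypothetical input containing an entity, to compare the two approaches.
$t = '<p>a &amp; b</p>';

// strip_tags() removes the tags but leaves the entity as written.
$viaStripTags = strip_tags($t);

// The DOM parser decodes entities while building the tree, so the text
// read back through nodeValue contains the literal character.
$doc = new DOMDocument();
$doc->loadHTML($t);
$viaDom = $doc->documentElement->nodeValue;

echo $plain, "\n";        // hello, world!
echo $keepP, "\n";        // hello, <p>world!</p>
echo $viaStripTags, "\n"; // a &amp; b
echo $viaDom, "\n";       // a & b
```

So the two are not interchangeable when the result is going to be written back into a page: the strip_tags() result still has its entities escaped, whereas the nodeValue result would need escaping again, for example with htmlspecialchars().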
