I have an HTML document (as a string) that contains a div with the class “foo”:
<html>
<head>
...
</head>
<body>
<div class="whatever">Blabla</div>
<div>
<span>Text</span>
</div>
<table>
<tr><td><div class="foo">GARBAGE</div></td></tr>
</table>
</body>
</html>
I would like to remove only the divs with the class “foo”. This is what I have so far:
$doc = new DOMDocument();
$doc->loadHTML($myhtml);
$xpath = new DOMXpath($doc);
$all = $xpath->query("/html");
$result = remove_elements_with_class('foo', $all);
What does the remove_elements_with_class function look like?
After:
You need to:
1. Find all of the <div> nodes whose class is foo
2. Call DOMNode::removeChild() on those nodes
So, to accomplish the first task, you can issue an XPath query that finds all of the <div> nodes whose class is foo.
A query that tests whether the class attribute contains foo also handles the cases where an element has more than one class, e.g. foo bar baz and baz foo bar. If this is undesirable, and you only want to match the class exactly (so only a class attribute of exactly foo will match), the query compares the attribute to foo for equality instead.
In PHP, run the query with DOMXPath::query() and store the result in $nodes.
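The queries described above can be sketched as follows. This is a minimal example under stated assumptions: the sample markup, the "foobar" div, and the variable names $exactNodes and $tokenNodes are illustrative and not part of the original answer. The third query is a common XPath 1.0 idiom added here for comparison, since contains() is a plain substring test and would also match a class such as foobar.

```php
<?php
// Sample markup (an assumption for illustration, modeled on the question).
$myhtml = '<html><body>'
    . '<div class="whatever">Blabla</div>'
    . '<div class="foo">GARBAGE</div>'
    . '<div class="foo bar baz">MORE GARBAGE</div>'
    . '<div class="foobar">KEEP</div>'
    . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($myhtml);
$xpath = new DOMXPath($doc);

// Substring match: any <div> whose class attribute contains "foo".
// This also matches class="foobar".
$nodes = $xpath->query('//div[contains(@class, "foo")]');

// Exact match: only a class attribute that is exactly "foo".
$exactNodes = $xpath->query('//div[@class="foo"]');

// Whole-word match (not from the original answer): pad the class list
// with spaces so that "foo" only matches as a complete class name.
$tokenNodes = $xpath->query(
    '//div[contains(concat(" ", normalize-space(@class), " "), " foo ")]'
);

printf(
    "contains: %d, exact: %d, token: %d\n",
    $nodes->length,
    $exactNodes->length,
    $tokenNodes->length
);
```

With this sample markup the three queries select three, one, and two nodes respectively, which shows how the choice of query changes what would be removed.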
From here, you have all the nodes you want to remove in $nodes, so just iterate over them, and remove each one from the document by grabbing the <div>’s parent node and removing the <div> as its child node. That’s all it takes! You can see it working in this demo.
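The removal step can be sketched as a small helper. The function name remove_divs_with_class, its signature, and the sample markup are assumptions made for illustration; only the parentNode/removeChild() pattern comes from the answer. The node list is copied into an array first so that removing nodes cannot affect the iteration.

```php
<?php
// Hypothetical helper (name and signature are assumptions).
// $class is assumed to be a plain class name containing no quotes.
function remove_divs_with_class(DOMDocument $doc, string $class): int
{
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//div[contains(@class, "' . $class . '")]');

    $removed = 0;
    // Copy to an array, then detach each node from its parent.
    foreach (iterator_to_array($nodes) as $node) {
        $node->parentNode->removeChild($node);
        $removed++;
    }
    return $removed;
}

// Sample markup (an assumption, modeled on the question's document).
$myhtml = '<html><body>'
    . '<div class="whatever">Blabla</div>'
    . '<table><tr><td><div class="foo">GARBAGE</div></td></tr></table>'
    . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($myhtml);

$removed = remove_divs_with_class($doc, 'foo');
$html = $doc->saveHTML();
echo $html;
```

After the call, the serialized document should still contain the "whatever" div and the table cell, but not the div with class foo.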
Edit: To keep the <div> and just remove the contents, set the node’s nodeValue property to an empty string. You can see it working in this updated demo. You could also replace the <div> with a newly created <div>, as that approach seems more bulletproof, but this should work for your use-case.
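Both options from the edit can be sketched side by side. The sample markup and the variable names $old and $fresh are assumptions for illustration, and copying the class attribute onto the new <div> is a choice made here, not something the answer specifies.

```php
<?php
// Sample markup (an assumption, modeled on the question's document).
$myhtml = '<html><body>'
    . '<div class="foo">GARBAGE</div>'
    . '<div class="foo">MORE GARBAGE</div>'
    . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($myhtml);
$xpath = new DOMXPath($doc);
$nodes = iterator_to_array($xpath->query('//div[contains(@class, "foo")]'));

// Option 1: keep the <div> and empty it in place.
$nodes[0]->nodeValue = '';

// Option 2: swap the <div> for a newly created, empty <div>.
// Carrying over the class attribute is an assumption of this sketch.
$old = $nodes[1];
$fresh = $doc->createElement('div');
$fresh->setAttribute('class', $old->getAttribute('class'));
$old->parentNode->replaceChild($fresh, $old);

$html = $doc->saveHTML();
echo $html;
```

Either way the document keeps two empty divs with class foo. The replacement approach starts from a clean element, so it does not depend on how nodeValue treats existing child nodes.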