I’m writing a program that uses a page’s source code, but in certain cases I want to ignore parts of it. Ultimately, I want to remove the element whose id is navigation, along with all of its contents, and then output the HTML.
Edited code:
<?php
$lol = new DOMDocument();
$fh = fopen("test.txt", "r");
$lol->loadHTML(fread($fh, filesize("test.txt")));
$lol->saveHTML();
$xpath = new DOMXpath($lol);
$nodeList = $xpath->query('//navigation');
foreach ($nodeList as $element) {
    $element->parentNode->removeChild($element);
}
/*
foreach($divs AS $div) {
    if($div->getAttribute('id') == "navigation") {
        $lol->removeChild($div);
    }
}
*/
$out = $lol->saveHTML();
echo $out;
?>
From what I’ve read online, I would have expected this to work, but it doesn’t.
Any suggestions appreciated.
test.txt is just a text file containing the source code of the page.
It looks like the comments have got you most of the way there; the XPath just needs a little tweaking.
$xpath->query('//navigation') will search for <navigation> tags, while you are looking for tags with the id navigation, which the expression //*[@id="navigation"] matches. XPath is pretty powerful for this sort of thing; this W3 tutorial is a good place to start learning more.
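As a rough sketch of that change, the script below removes every element with id navigation and prints what is left. The inline $html string is a made-up stand-in for the contents of test.txt, since the real markup isn't shown; with the actual file, something like file_get_contents('test.txt') would take its place. The libxml_use_internal_errors() call is an optional extra, not part of the original code, that keeps parser warnings about sloppy markup out of the output.

```php
<?php
// Hypothetical stand-in for the contents of test.txt (the real page is not shown).
$html = '<html><body>'
      . '<div id="navigation"><a href="/">Home</a></div>'
      . '<p>Main content</p>'
      . '</body></html>';

$doc = new DOMDocument();

// Optional: collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Match any element whose id attribute equals "navigation",
// rather than elements named <navigation>.
$nodeList = $xpath->query('//*[@id="navigation"]');

foreach ($nodeList as $element) {
    // Removing the node from its parent also removes all of its children.
    $element->parentNode->removeChild($element);
}

$out = $doc->saveHTML();
echo $out;
```

If only div elements should be considered, //div[@id="navigation"] narrows the match in the same way.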
(If that doesn’t work, I’ll echo the calls to post the relevant HTML.)