$dom = new DOMDocument('1.0', 'UTF-8');
$dom->loadHTML($content);
// Copy the live node list into an array first; removing nodes while
// iterating the live list can skip elements.
$divs = iterator_to_array($dom->getElementsByTagName("div"));
foreach ( $divs as $div ) {
    if ( $div->getAttribute("class") == "simplegalleryholder" ) {
        $div->parentNode->removeChild( $div );
    }
}
$content = $dom->saveHTML();
This simple code should help me remove
<div class="simplegalleryholder"> .... </div>
from the document. The only problem is that $content contains UTF-8 encoded special characters (ąęść etc.), which are destroyed by the process (I get iÄ™ Å‚ ż instead).
How should I approach this issue to get the correct result?
Specifying UTF-8 in the constructor doesn't make the underlying XML processing library treat the input as UTF-8. A workaround that is really hacky, but works reasonably well, is to declare the encoding inside the markup itself with an http-equiv meta tag before loading it. See https://bugs.php.net/bug.php?id=32547
If you're viewing the output in a web browser, send a real HTTP header, not an http-equiv meta tag. The header only affects how the browser displays the result; processing with DOMDocument specifically needs the meta tag.
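
A minimal sketch of the meta tag workaround, assuming the markup in $content is already valid UTF-8. The function name removeGalleryHolders is hypothetical and used only for illustration; the DOM calls themselves are the standard DOMDocument API.

```php
<?php
// Hypothetical helper (name not from the original post): strips
// <div class="simplegalleryholder"> blocks from UTF-8 HTML.
function removeGalleryHolders(string $content): string
{
    $dom = new DOMDocument();

    // Prepend a meta tag so the parser reads the bytes as UTF-8.
    // Assumption: $content is valid UTF-8.
    $meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">';

    // @ silences the warnings libxml raises on imperfect real-world HTML.
    @$dom->loadHTML($meta . $content);

    // Copy the live node list so removals do not disturb the iteration.
    foreach (iterator_to_array($dom->getElementsByTagName('div')) as $div) {
        if ($div->getAttribute('class') === 'simplegalleryholder') {
            $div->parentNode->removeChild($div);
        }
    }

    return $dom->saveHTML();
}
```

Called this way, saveHTML() returns a whole document (doctype, html, head with the injected meta tag, and body), so you may need to extract the body contents if you only want a fragment. When echoing the result to a browser, calling header('Content-Type: text/html; charset=utf-8') beforehand covers the display side.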