I have an XML file with a header like the following:
<!ENTITY nbsp " "><!-- no-break space = non-breaking space,
U+00A0 ISOnum -->
<!ENTITY iexcl "¡"><!-- inverted exclamation mark, U+00A1 ISOnum -->
<!ENTITY cent "¢"><!-- cent sign, U+00A2 ISOnum -->
<!ENTITY pound "£"><!-- pound sign, U+00A3 ISOnum -->
<!ENTITY curren "¤"><!-- currency sign, U+00A4 ISOnum -->
<!ENTITY yen "¥"><!-- yen sign = yuan sign, U+00A5 ISOnum -->
<!ENTITY brvbar "¦"><!-- broken bar = broken vertical bar,
U+00A6 ISOnum -->
<!ENTITY sect "§"><!-- section sign, U+00A7 ISOnum -->
<!ENTITY uml "¨"><!-- diaeresis = spacing diaeresis,
U+00A8 ISOdia -->
<!ENTITY copy "©"><!-- copyright sign, U+00A9 ISOnum -->
<!ENTITY ordf "ª"><!-- feminine ordinal indicator, U+00AA ISOnum -->
<!ENTITY laquo "«"><!-- left-pointing double angle quotation mark
= left pointing guillemet, U+00AB ISOnum -->
<!ENTITY not "¬"><!-- not sign, U+00AC ISOnum -->
<!ENTITY shy "­"><!-- soft hyphen = discretionary hyphen,
U+00AD ISOnum -->
<!ENTITY reg "®"><!-- registered sign = registered trade mark sign,
U+00AE ISOnum -->
<!ENTITY macr "¯"><!-- macron = spacing macron = overline
= APL overbar, U+00AF ISOdia -->
<!ENTITY deg "°"><!-- degree sign, U+00B0 ISOnum -->
<!ENTITY plusmn "±"><!-- plus-minus sign = plus-or-minus sign,
U+00B1 ISOnum -->
When I try to load it into a DOMDocument, nothing seems to get saved to file. I think the declarations above may be causing parsing errors. Is there a way to remove these headers?
This is my PHP code:
$xml = curl_exec($ch);
$srcDom = new DOMDocument;
$srcDom->load($xml);
$xPath = new DOMXPath($srcDom);
foreach ($srcDom->getElementsByTagName('Venue') as $venue) {
    $dstDom = new DOMDocument('1.0', 'utf-8');
    $dstDom->appendChild($dstDom->createElement('EventsPricePoints'));
    $dstDom->documentElement->appendChild($dstDom->importNode($venue, true));
    $allEventsForVenue = $xPath->query(
        sprintf(
            '/Store/EventsPricePoints/Event[VenueID/@ID=%d]',
            $venue->getAttribute('ID')
        )
    );
    foreach ($allEventsForVenue as $event) {
        $dstDom->documentElement->appendChild($dstDom->importNode($event, true));
    }
    $dstDom->formatOutput = true;
    $dstDom->saveXml(sprintf('/var/www/html/venuexml/%d.xml', $venue->getAttribute('ID')));
}
Your code is most likely not causing parsing errors. If it were, you would see a warning once error logging or reporting is enabled, but I don't think that is the case here.
Instead, the document loads, and because XML is UTF-8 encoded by default, none of those entities are needed: the XML can contain the characters themselves, so they do not have to be transported as entity references.
Therefore both the entity definitions and the entity references inside the XML are superfluous. I guess DOMDocument will just remove those.
Additionally, if you had given an example XML chunk for testing purposes, you would have gotten a more concrete answer.
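If you do want the declarations gone explicitly, here is a minimal sketch of one way it could be done. It assumes the response body is available as a string (for cURL that means CURLOPT_RETURNTRANSFER is set). The sample XML and the use of sys_get_temp_dir() as the output directory are invented for illustration; only the element names come from the question. Note two things it does differently from the code in the question: load() expects a filename or URL while loadXML() parses a string, and saveXML() returns the markup as a string while save() writes it to a path. LIBXML_NOENT also allows external entities to be expanded, so it is probably only appropriate for input you trust.

```php
<?php
// Hypothetical sample standing in for the cURL response body.
$xml = <<<'XML'
<!DOCTYPE Store [
<!ENTITY pound "&#163;">
<!ENTITY copy "&#169;">
]>
<Store>
  <Venue ID="7"><Name>Hall &copy;</Name></Venue>
  <EventsPricePoints>
    <Event><VenueID ID="7"/><Price>&pound;10</Price></Event>
    <Event><VenueID ID="8"/><Price>&pound;12</Price></Event>
  </EventsPricePoints>
</Store>
XML;

$srcDom = new DOMDocument();
// Parse from a string; LIBXML_NOENT replaces entity references
// with their replacement text while parsing.
$srcDom->loadXML($xml, LIBXML_NOENT);

// With the references expanded, the DOCTYPE and its ENTITY
// declarations are no longer needed and can be removed.
if ($srcDom->doctype !== null) {
    $srcDom->removeChild($srcDom->doctype);
}

$xPath = new DOMXPath($srcDom);
$outDir = sys_get_temp_dir(); // assumption: any writable directory will do
$written = [];

foreach ($srcDom->getElementsByTagName('Venue') as $venue) {
    $id = (int) $venue->getAttribute('ID');

    $dstDom = new DOMDocument('1.0', 'utf-8');
    $dstDom->formatOutput = true;
    $root = $dstDom->appendChild($dstDom->createElement('EventsPricePoints'));
    $root->appendChild($dstDom->importNode($venue, true));

    $events = $xPath->query(
        sprintf('/Store/EventsPricePoints/Event[VenueID/@ID=%d]', $id)
    );
    foreach ($events as $event) {
        $root->appendChild($dstDom->importNode($event, true));
    }

    // save() writes the document to a file and returns false on failure.
    $path = sprintf('%s/%d.xml', $outDir, $id);
    if ($dstDom->save($path) === false) {
        throw new RuntimeException("Could not write $path");
    }
    $written[$id] = $path;
}
```

Each output file then contains the literal characters rather than entity references, and no DOCTYPE. Checking the return value of save() also makes a failed write visible instead of silent.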