I have some strings that are valid in my database, but when I include them in an attribute of a UTF-8 XML document I get the following error:
XML Parsing Error: not well-formed
My current code (simplified):
header('Content-Type: text/xml');
echo '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>';
echo '<root attribute="' . htmlentities($string_from_hell) . '">';
How should I format these strings before including them in XML attributes?
A possible value for $string_from_hell:  (don’t know if it will show up properly)
Try
htmlentities won’t do, because it produces HTML entities that XML does not recognize. You should also specify the charset: before PHP 5.4 the default is ISO-8859-1, not UTF-8. You’re also missing the quotes (") around the attribute value.
There are also better ways to create XML files that handle escaping for you; see, for example, XMLWriter.
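As a rough sketch of both suggestions (not necessarily the exact code the answer had in mind), the snippet below first escapes the value by hand with htmlspecialchars, passing the charset explicitly, and then builds the same element with XMLWriter. The sample value of $string_from_hell is made up for illustration, since the original string is not shown, and the ENT_XML1 flag assumes PHP 5.4 or later.

```php
<?php
// Hypothetical sample value; the real string from the question is not shown.
$string_from_hell = 'Tom & "Jerry" <café>';

// Option 1: escape by hand.
// ENT_QUOTES escapes both single and double quotes; ENT_XML1 (PHP 5.4+)
// limits the output to entities that XML itself defines.
$escaped = htmlspecialchars($string_from_hell, ENT_QUOTES | ENT_XML1, 'UTF-8');
$manual  = '<root attribute="' . $escaped . '"/>';

// Option 2: let XMLWriter handle the escaping.
$writer = new XMLWriter();
$writer->openMemory();                          // build the document in memory
$writer->startDocument('1.0', 'UTF-8', 'yes');  // version, encoding, standalone
$writer->startElement('root');
$writer->writeAttribute('attribute', $string_from_hell);
$writer->endElement();
$writer->endDocument();
$generated = $writer->outputMemory();

echo $manual, "\n", $generated;
```

With XMLWriter the raw string is passed in unescaped, so there is no risk of escaping it twice or of forgetting the quotes around the attribute value.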