I am reading the documentation for creating a podcast feed suitable for iTunes ,

Question

Asked: May 15, 2026 (2026-05-15T14:22:49+00:00)

I am reading the documentation for creating a podcast feed suitable for iTunes, and the Common Mistakes section says:

Using HTML Named Character Entities.

<!-- illegal xml -->
<copyright>&copy; 2005 John Doe</copyright>

<!-- valid xml -->
<copyright>&#xA9; 2005 John Doe</copyright>

Unlike HTML, XML supports only five
“named character entities”:

character   name               xml
&           ampersand          &amp;
<           less-than sign     &lt;
>           greater-than sign  &gt;
'           apostrophe         &apos;
"           quotation          &quot;

The five characters above are the only
characters that require escaping in
XML. All other characters can be
entered directly in an editor that
supports UTF-8. You can also use
numeric character references that
specify the Unicode for the character,
for example:

character   name                       xml
©           copyright sign             &#xA9;
℗           sound recording copyright  &#x2117;
™           trade mark sign            &#x2122;

For further reference see XML
Character and Entity References.

Right now I’m using htmlentities() under PHP5 and the feed is validating and working. But from what I gather, some content could be converted into named entities that would make the feed invalid. What’s the best function to use to ensure I’m not passing along bad data? I’m worried something will get entered, be turned into an entity, and break the feed. Should I just use str_replace() to substitute the five named entities above and leave the rest alone? Or can I use htmlspecialchars() somehow?

So in short, what’s a drop-in replacement for htmlentities() that will make sure input is safe for descriptions, titles, etc. in a podcast RSS feed?

1 Answer

Editorial Team · Answer 1 · 2026-05-15T14:22:50+00:00

You can either:

Use a CDATA block instead (just make sure you’re using the correct encoding, i.e., the encoding of the XML file matches the encoding of the data). The only thing you have to look out for is ]]>, which cannot appear literally inside a CDATA block.
Use mb_encode_numericentity() instead of htmlentities() (possibly combined with htmlspecialchars() and a prior decoding of HTML entities with mb_convert_encoding()).
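For the CDATA option, a minimal sketch might look like the following. The helper name wrap_cdata() is hypothetical (it is not a PHP built-in), and the sketch assumes the text is already in the same encoding as the feed. The only special case is the ]]> sequence, which is split across two adjacent CDATA sections so that a parser reassembles the original text.

```php
<?php
// Hypothetical helper (not part of PHP): wraps text in a CDATA section.
// Assumes $text is already in the feed's encoding (e.g. UTF-8).
function wrap_cdata($text)
{
    // "]]>" would terminate the section early, so end the section after "]]"
    // and start a new one that begins with ">".
    $safe = str_replace(']]>', ']]]]><![CDATA[>', $text);

    return '<![CDATA[' . $safe . ']]>';
}

// Usage: markup-like characters need no escaping inside CDATA.
echo '<description>' . wrap_cdata('Tips & tricks for <b>podcasters</b>') . '</description>' . "\n";
```

Note that CDATA only removes the need to escape markup characters. It does not repair text that is in the wrong encoding, and HTML named entities such as &copy; inside a CDATA section are passed through as literal text rather than decoded.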

If the encoding of the XML file is UTF-8, you can simply decode the entities into literal characters and then escape only the XML special characters. Suppose you have the following HTML fragment:

&copy; 2005 John Doe

Then, you could just do:

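$data = "&copy; 2005 John Doe";
// Decode HTML entities (e.g. &copy;) into literal UTF-8 characters.
$data = mb_convert_encoding($data, "UTF-8", "HTML-ENTITIES");
// Re-escape only the XML markup characters &, < and > (quotes are left alone).
$data = htmlspecialchars($data, ENT_NOQUOTES, "UTF-8");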
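As a rough sketch of the drop-in replacement the question asks for, the two steps above can be wrapped in one function. This version swaps in html_entity_decode() for the mb_convert_encoding() call, since the "HTML-ENTITIES" pseudo-encoding is deprecated as of PHP 8.2, and it relies on the ENT_HTML5 and ENT_XML1 flags, which need PHP 5.4 or later. The name xml_safe() is hypothetical, and the sketch assumes the input is a UTF-8 HTML fragment destined for element content (a title or description), not an attribute value.

```php
<?php
// Hypothetical helper (not part of PHP): makes an HTML fragment safe to
// place inside an element of a UTF-8 XML feed.
function xml_safe($text)
{
    // Step 1: turn entities already present in the input (&copy;, &amp;,
    // &#169;, ...) into literal UTF-8 characters.
    $decoded = html_entity_decode($text, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    // Step 2: escape only &, < and >. Quotes are left as-is, which is
    // fine for element content but not for attribute values.
    return htmlspecialchars($decoded, ENT_NOQUOTES | ENT_XML1, 'UTF-8');
}

// Usage inside a feed template:
echo '<copyright>' . xml_safe('&copy; 2005 John Doe & Sons') . '</copyright>' . "\n";
```

Because the input is decoded before it is escaped, a fragment that already contains &amp; is not double-escaped. The trade-off is that text which was never HTML but happens to contain something like &copy; will also be decoded, so this only suits input that is known to be an HTML fragment.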
