I’ve got some unknown content coming in as a description, maybe something like this:
<description>
<p>
<span>
<font>Hello</font>
</span>
World!
<a href="/index">Home</a>
</p>
</description>
There could conceivably be any HTML tag in there, and I don’t want all of them. The tags I want to allow are p, i, em, strong, b, ol, ul, li and a. So, for example, <font> would be stripped, but <p> and <a> would remain. I’m assuming I have to match the ones I want (and make sure nothing matches the others), but I can’t work out how to do it.
Any help?
Whitelist those elements:
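A minimal sketch of such a whitelist in XSLT 1.0 could look like the following. It assumes the <description> wrapper should be kept as well, and that all attributes on allowed elements (such as href on <a>) are wanted; both are assumptions, so adjust the match pattern and the attribute copy if they don't hold.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Whitelisted elements (plus the assumed <description> wrapper):
       copy the element and its attributes, then process its children. -->
  <xsl:template match="description|p|i|em|strong|b|ol|ul|li|a">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Every other element: output nothing. Because no templates are
       applied here, the element's descendants are dropped too. -->
  <xsl:template match="*"/>

</xsl:stylesheet>
```

The named elements win over the `*` template because a plain name test has a higher default priority than the wildcard. Text nodes are handled by the built-in rule, which copies them through. With the sample input, <p> and <a> survive, while <span> is removed together with the <font> and the "Hello" inside it.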
Note that this removes the undesired elements and everything below them. To strip just the <font> element itself, for example, but keep its children, change the last template so that it applies templates to the element's children instead of producing nothing.
An equivalent (and slightly cleaner) solution:
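One way this could be written, sketched as a complete stylesheet: allowed elements are copied, and every other element is skipped while its children are still processed. As an assumption, the <description> wrapper is treated as allowed and all attributes of allowed elements are copied.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Allowed elements: copy the tag and its attributes, recurse into children. -->
  <xsl:template match="description|p|i|em|strong|b|ol|ul|li|a">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

  <!-- Any other element: drop the tag itself but keep processing its children.
       This is what the built-in rule for elements does anyway, so this
       template can be left out entirely without changing the result. -->
  <xsl:template match="*">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```

With the sample input, the <span> and <font> tags disappear but "Hello" now stays inside the <p>.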
The opposite approach is to blacklist the unwanted elements:
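A sketch of the blacklist variant, built on the standard identity template. The list of unwanted elements here (font and span) is only an illustration taken from the sample input; a real blacklist would have to name every element to be removed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Identity template: copy every node and attribute unchanged by default. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Blacklisted elements (illustrative list): output nothing,
       which removes the element and everything inside it. -->
  <xsl:template match="font|span"/>

</xsl:stylesheet>
```

Anything not named in the blacklist passes through untouched, so for untrusted input where "any HTML tag" may appear, the whitelist is usually the safer of the two approaches.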
Again, add an apply-templates to the final template if you want to keep the children of the skipped elements.
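For the blacklist case, that change might look like this sketch, again using font and span purely as an illustrative list of unwanted elements:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" omit-xml-declaration="yes"/>

  <!-- Identity template: copy everything that is not matched more specifically. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Blacklisted elements (illustrative list): drop the tag and its
       attributes, but keep processing the child nodes. -->
  <xsl:template match="font|span">
    <xsl:apply-templates/>
  </xsl:template>

</xsl:stylesheet>
```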