I’m using JAXB to generate XML that is uploaded to our Google feed. While testing and comparing this new method’s output to the output from the old way we were doing it (using JSPs), I noticed that single quote characters aren’t being handled correctly.
Field Content:
& ' " > <
Old Correct output:
<title> &amp; &#39; &quot; &gt; &lt; </title>
New Incorrect output:
<title> &amp; ' &quot; &gt; &lt; </title>
I tried replacing all single quote characters in the field with
&#39;
before I marshal the XML, but the marshaller then escapes the ampersand as well (giving &amp;#39;), which leaves me with a useless #39 sitting there after marshalling.
At which point should I try to remedy this problem? Can I get the correct behavior by altering the string in some way before passing it into the JAXB class, or is there something I must do to change how the marshalling handles single quotes?
Thanks for reading!
EDIT:
Sorry I wasn’t clearer before. Google’s documentation requires that those 5 characters be represented by either their entity codes or their character codes.
From their documentation:
Data values that are not in CDATA sections, including URLs, must use escape codes for the characters listed in the
following table. You can use either the entity code or the character code to represent these special characters.
Character      Symbol   Entity code   Character code
Ampersand      &        &amp;         &#38;
Single Quote   '        &apos;        &#39;
Double Quote   "        &quot;        &#34;
Greater Than   >        &gt;          &#62;
Less Than      <        &lt;          &#60;
I would like to avoid the CDATA route if possible.
Single quotes don’t have to be escaped in element content. The second output is well-formed XML and more concise, which is even better.
If you want fine-grained control over which characters are escaped (and how), you might try to implement your own
CharacterEscapeHandler. I've never tried it, but it is documented as a feature of the JAXB RI.
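For illustration, here is a rough sketch of what the escaping logic of such a handler might look like. It is untested against a real marshaller. The class name FeedEscaper and the escapeToString helper are made up for this example, and the code deliberately avoids importing anything from JAXB so it compiles on its own. The escape method uses the same parameter list as the RI's CharacterEscapeHandler.escape, so with the RI on the classpath the class could be declared as implementing that interface.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical helper: the name FeedEscaper is not part of JAXB.
final class FeedEscaper {

    // Same parameter list as the JAXB RI's CharacterEscapeHandler.escape(...).
    // Writes ch[start .. start+length) to out, escaping the five characters
    // from Google's table. The single quote uses its character code (&#39;).
    public void escape(char[] ch, int start, int length, boolean isAttVal, Writer out)
            throws IOException {
        int end = start + length;
        for (int i = start; i < end; i++) {
            char c = ch[i];
            switch (c) {
                case '&':  out.write("&amp;");  break;
                case '\'': out.write("&#39;");  break;
                case '"':  out.write("&quot;"); break;
                case '>':  out.write("&gt;");   break;
                case '<':  out.write("&lt;");   break;
                default:   out.write(c);        // everything else passes through
            }
        }
    }

    // Convenience wrapper (also hypothetical) for trying the logic on a String.
    static String escapeToString(String text) {
        StringWriter out = new StringWriter();
        char[] chars = text.toCharArray();
        try {
            new FeedEscaper().escape(chars, 0, chars.length, false, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }
}
```

Calling FeedEscaper.escapeToString on the field content from the question should give the "old correct" form. To use it for real, the handler would be registered on the Marshaller through setProperty. The property name is commonly given as com.sun.xml.bind.characterEscapeHandler, but treat that as an assumption and check it against the RI version you are running. Note that this sketch does not deal with characters that the output encoding cannot represent.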