I am using a Java program to create an XML file of stock holders. The generated XML should look like this:
<?xml version="1.0" encoding="UTF-8" ?>
<urlset>
  <url>
    <loc>FirstName-LastName/id/</loc>
  </url>
</urlset>
Some stock holders have special characters in their names, e.g. A. Pitkänen. When I look at the XML for such a stock holder, it looks like this:
<?xml version="1.0" encoding="UTF-8" ?>
<urlset>
  <url>
    <loc>/A-Pitk寥n/ELS_1005091/</loc>
  </url>
</urlset>
This makes the XML invalid. Why is this happening? The Java program is:
FileWriter fstream = new FileWriter("c:\\stock-holders.xml");
final BufferedWriter out = new BufferedWriter(fstream);
try {
    // Make the connection and query the stock holders to get the result set
    String aId = "";
    String aFName = "";
    String aLName = "";
    out.write("<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\n");
    out.write("<urlset>\n");
    while (rs.next()) {
        String url = "";
        aFName = rs.getString(2);
        if (StringUtils.isNotEmpty(aFName)) {
            aFName = aFName.trim();
            url += aFName;
        }
        aLName = rs.getString(3);
        if (StringUtils.isNotEmpty(aLName)) {
            aLName = aLName.trim();
            // Append the last name here (the first name was already added above)
            url += "-" + aLName;
        }
        aId = rs.getString(1);
        if (StringUtils.isNotEmpty(aId)) {
            aId = aId.trim();
            url += "/" + aId + "/";
        }
        out.write("<url>\n");
        out.write("<loc>" + url + "</loc>\n");
        out.write("</url>\n");
        out.flush();
    }
    out.write("</urlset>");
} finally {
    // Close the writer even if the query or a write fails
    out.close();
}
Since your XML file is supposed to be written in UTF-8, you need to configure your Writers to use that encoding rather than the system default one, for example by wrapping a FileOutputStream in an OutputStreamWriter that is created with "UTF-8".

Note that FileWriter is not recommended for this very reason: before Java 11 it could not be configured to use an encoding other than the default one.

Also, it would probably be better to use an existing API for constructing XML files (such as DOM or StAX) rather than string concatenation. For example, your solution doesn't take into account that your data may contain characters that are illegal in XML and must be escaped.
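Both suggestions can be sketched roughly as follows. This is only an illustration under some assumptions, not a drop-in replacement for your program: the class name StockHolderXml, the helper names openUtf8Writer and writeWithStax, and the sample values (including the "Smith & Sons" entry) are made up for the example, and the database query is replaced by a plain list of already-built loc strings.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper class, for illustration only.
public class StockHolderXml {

    // Option 1: keep writing strings, but choose the encoding explicitly
    // instead of relying on the platform default that FileWriter uses.
    static Writer openUtf8Writer(Path file) throws IOException {
        return new BufferedWriter(
                new OutputStreamWriter(Files.newOutputStream(file), StandardCharsets.UTF_8));
    }

    // Option 2: let StAX handle both the encoding and the escaping.
    static void writeWithStax(Path file, List<String> locs)
            throws IOException, XMLStreamException {
        try (OutputStream os = Files.newOutputStream(file)) {
            XMLStreamWriter xml =
                    XMLOutputFactory.newInstance().createXMLStreamWriter(os, "UTF-8");
            xml.writeStartDocument("UTF-8", "1.0");
            xml.writeStartElement("urlset");
            for (String loc : locs) {
                xml.writeStartElement("url");
                xml.writeStartElement("loc");
                xml.writeCharacters(loc); // escapes characters such as & and <
                xml.writeEndElement();    // </loc>
                xml.writeEndElement();    // </url>
            }
            xml.writeEndElement();        // </urlset>
            xml.writeEndDocument();
            xml.close();                  // flushes the writer; the stream is closed by try-with-resources
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("stock-holders", ".xml");
        // The a-umlaut is written as a Unicode escape so that this source file
        // compiles the same way regardless of its own encoding.
        writeWithStax(file, Arrays.asList("A-Pitk\u00e4nen/ELS_1005091/", "Smith & Sons/ELS_1/"));
        System.out.println(new String(Files.readAllBytes(file), StandardCharsets.UTF_8));
    }
}
```

With option 1 you still have to escape the data yourself; option 2 avoids that because writeCharacters does the escaping for you. Whatever reads the file back must of course also decode it as UTF-8.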