I am having an encoding issue in Java. The string I need to handle is the output of the “systeminfo” command run from the Windows command line, and I need to present the result in an HTML document. The problem is that when I run my application on a French operating system, garbled characters show up in the HTML, no matter how I try to convert the encoding.
From the log, I can see the system encoding is “Cp1252”. The code snippet is as follows:
String systemEncoding = System.getProperty("sun.jnu.encoding");
log.info("sun.jnu.encoding="+systemEncoding);
In the HTML builder class, I did something like this:
for (String line : lines) {
    line = new String(line.getBytes("Cp1252"), "UTF8");
    osReport.append(line + "<br>");
}
Unfortunately, I can still see garbled “question marks” all over the place where the French characters are supposed to be.
The HTML header looks like this, by the way:
<HEAD>
<META content="text/html; charset=UTF-8" http-equiv=Content-Type>
</HEAD>
As for how I get the response string, please see the following piece of code:
try {
    String systemEncoding = System.getProperty("sun.jnu.encoding");
    log.info("sun.jnu.encoding=" + systemEncoding);
    InputStreamReader isr;
    if (StringUtil.isEmpty(systemEncoding)) {
        isr = new InputStreamReader(is);
    } else {
        isr = new InputStreamReader(is, systemEncoding);
    }
    BufferedReader br = new BufferedReader(isr);
    String line = null;
    while ((line = br.readLine()) != null) {
        res.append(line);
        res.append(LINE_SEP);
    }
} catch (IOException ioe) {
    log.error("IOException occurred while printing the response", ioe);
}
Any help?? Thanks so very much!
I am assuming you are invoking the command via the Process type. I would expect systeminfo.exe to write its output using the default ANSI encoding (windows-1252 on a French system).
That means you can read the input with the default encoding (the one used by the InputStreamReader(InputStream) constructor). This transcodes the input from the default encoding to UTF-16. The Scanner type can do the same with the default system encoding.
Java strings are always UTF-16, so the line = new String(line.getBytes("Cp1252"), "UTF8") call in your HTML builder is just a transcoding bug: it encodes the already-decoded text as windows-1252 bytes and then misreads those bytes as UTF-8.
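To illustrate the two points above, here is a minimal sketch, not the asker's actual code. The class name ProcessOutputReader and the helpers readAll and brokenRoundTrip are hypothetical names introduced only for this example. It assumes systeminfo really emits the default charset. On Java 18 and later the default charset is UTF-8 regardless of the OS, so there the native encoding may need to be passed explicitly instead.

```java
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

class ProcessOutputReader {
    // Decodes the stream with the given charset; the resulting String is UTF-16.
    static String readAll(InputStream in, Charset charset) {
        StringBuilder out = new StringBuilder();
        try (Scanner scanner = new Scanner(in, charset.name())) {
            while (scanner.hasNextLine()) {
                out.append(scanner.nextLine()).append('\n');
            }
        }
        return out.toString();
    }

    // Reproduces the bug pattern: windows-1252 bytes misread as UTF-8.
    static String brokenRoundTrip(String alreadyDecoded) {
        byte[] cp1252 = alreadyDecoded.getBytes(Charset.forName("windows-1252"));
        return new String(cp1252, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the command exists (Windows) and writes the default charset.
        Process process = new ProcessBuilder("systeminfo")
                .redirectErrorStream(true)
                .start();
        String report = readAll(process.getInputStream(), Charset.defaultCharset());
        process.waitFor();
        System.out.print(report);
    }
}
```

Once readAll has returned, the text needs no further conversion inside the JVM. The only remaining encoding step is at the point where the HTML is written out.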
Ensure that you are encoding your HTML file correctly.
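As a rough sketch of what "encoding the HTML file correctly" could look like: the string is encoded as UTF-8 exactly once, at the point where it is written, so the bytes match the charset declared in the META tag. HtmlReportWriter, buildHtml and write are hypothetical names for illustration. Escaping of characters such as < and & is left out for brevity.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class HtmlReportWriter {
    // Builds a small page whose META charset matches the bytes produced by write().
    static String buildHtml(List<String> lines) {
        StringBuilder html = new StringBuilder();
        html.append("<HTML><HEAD>")
            .append("<META content=\"text/html; charset=UTF-8\" http-equiv=Content-Type>")
            .append("</HEAD><BODY>");
        for (String line : lines) {
            html.append(line).append("<br>");
        }
        return html.append("</BODY></HTML>").toString();
    }

    // Encodes the UTF-16 string as UTF-8 at the output boundary, never earlier.
    static void write(OutputStream out, String html) {
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(html);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        write(bytes, buildHtml(Arrays.asList("Syst\u00e8me d'exploitation")));
        System.out.println(bytes.size() + " bytes written as UTF-8");
    }
}
```

In a real application the OutputStream would typically be a file or HTTP response stream. A ByteArrayOutputStream is used here only to keep the sketch self-contained.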
I would not try to read or directly change system properties like sun.jnu.encoding or file.encoding; these are JVM implementation details, and their direct use or configuration is not supported.
If you are relying on System.out to verify characters, ensure the device consuming the output decodes its input as windows-1252. See here for more on encoding.
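One way to take the console out of the picture when verifying characters is to log non-ASCII characters as Unicode escapes, which look the same however the consuming device decodes its input. The following is only a sketch of that idea. CodePointDump and escapeNonAscii are hypothetical names, not part of any library.

```java
class CodePointDump {
    // Replaces every non-ASCII UTF-16 code unit with a \\uXXXX escape.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A correctly decoded e-acute shows up as \u00e9 in the output.
        System.out.println(escapeNonAscii("\u00e9t\u00e9"));
    }
}
```

If the dump shows \ufffd, the bytes were decoded with the wrong charset somewhere upstream. A literal ? in the dump suggests the character was already lost when the text was encoded.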