Someone reported that a program I gave him, which uses log4j, doesn’t print characters correctly. He tells me that “é” gets printed in the file as “Ã©” (for example, “Vidéo” becomes “VidÃ©o”).
It’s probably an encoding issue, but I like to reproduce a problem so I can prove that it’s fixed.
I was unable to find good (and short) documentation on the subject so:
- What causes this problem, and how does log4j choose the encoding?
- Can it be fixed by simply using “log4j.appender.myappender.encoding=UTF-8”?
Thank you for the help!
WriterAppender (which is the base class for FileAppender and its variants) has a setEncoding method. So yes: using log4j.appender.myappender.encoding=UTF-8 should simply work.
Note, however, that “Vidéo” becoming “VidÃ©o” looks like the file is being written as UTF-8, but whatever you use to view it interprets it as some other encoding (usually one of the ISO-8859-* encodings or one of their derivatives).
Ã is U+00C3 and © is U+00A9. They are encoded as 0xC3 and 0xA9 in ISO-8859-1. é is U+00E9, which is encoded as the two bytes 0xC3 0xA9 in UTF-8. So a viewer that reads those two UTF-8 bytes as ISO-8859-1 shows “Ã©” instead of “é”.
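To reproduce the symptom without the reporter's setup, the byte-level explanation can be checked with a small standalone program. This is only a sketch: the class name MojibakeDemo and the helpers writeWith and hex are made up for illustration, and it does not call log4j at all. It uses a plain OutputStreamWriter as a stand-in, on the assumption that a writer-based appender turns characters into bytes in a similar way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    // Hypothetical helper: encodes text through an OutputStreamWriter
    // using the given charset and returns the bytes that were written.
    static byte[] writeWith(String text, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        }
        return out.toByteArray();
    }

    // Hypothetical helper: formats bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // "Vidéo", written with a Unicode escape so the result does not
        // depend on the encoding of this source file.
        String original = "Vid\u00E9o";

        byte[] utf8 = writeWith(original, StandardCharsets.UTF_8);
        byte[] latin1 = writeWith(original, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8 bytes:      " + hex(utf8));
        System.out.println("ISO-8859-1 bytes: " + hex(latin1));

        // Simulate a viewer that wrongly decodes the UTF-8 bytes as ISO-8859-1.
        String misread = new String(utf8, StandardCharsets.ISO_8859_1);
        // U+00C3 is "Ã" and U+00A9 is "©", so this is the reported "VidÃ©o".
        System.out.println("Misread matches the report: "
                + misread.equals("Vid\u00C3\u00A9o"));

        // Decoding with the charset that was used for writing round-trips.
        String correct = new String(utf8, StandardCharsets.UTF_8);
        System.out.println("Correct decode round-trips: " + correct.equals(original));
    }
}
```

If the hex dump of the real log file shows C3 A9 where “é” should be, the file itself is valid UTF-8 and the fix belongs on the viewer side. If it shows a single E9, the file was written in an ISO-8859-1-like encoding instead.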