I’m trying to parse an XML file that contains Hebrew characters.
I know the file is otherwise correct, because if I export it (from a different piece of software) without the Hebrew characters, it parses just fine.
I have tried many things, but I always get this error:
MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
My latest attempt was to open it using a FileInputStream and specify the encoding:
DocumentBuilder db = dbf.newDocumentBuilder();
document = db.parse(new FileInputStream(new File(xmlFileName)), "Cp1252");
(Cp1252 is an encoding that worked for me in a different app)
But I got the same result.
I tried using a byte array as well, but nothing worked.
Any suggestions?
If you know the correct encoding of the file and it’s not UTF-8, you can either declare it in the XML header, via the encoding attribute of the <?xml ...?> declaration, or decode the file yourself and parse it as a Reader.
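The Reader approach could look like the following minimal sketch. The class name EncodedXmlParser and the helper method parse are made up for illustration, and the charset name is a parameter because the file's real encoding is not known from the question. Note that in DocumentBuilder.parse(InputStream, String) the second argument is a system ID (a base URI for resolving relative references), not an encoding, which would explain why passing "Cp1252" there changed nothing.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EncodedXmlParser {

    // Hypothetical helper: decodes the raw bytes with the given charset
    // and hands the parser characters instead of bytes.
    static Document parse(InputStream in, String charsetName) throws Exception {
        DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try (Reader reader = new InputStreamReader(in, Charset.forName(charsetName))) {
            // With a character stream, the parser reads the characters directly
            // and disregards any encoding declaration inside the document.
            return db.parse(new InputSource(reader));
        }
    }
}
```

It would be called as EncodedXmlParser.parse(new FileInputStream(xmlFileName), "windows-1255"). The name "windows-1255" is only an assumption here, being a common encoding for Hebrew text; use whatever encoding the exporting software actually writes. Cp1252 is a Western European code page and has no Hebrew letters, so it is unlikely to be the right choice for this file even if it worked in another application.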