I'm pulling a file down from S3. When I pass the content stream to ObjectMapper as a bare InputStream, decoding fails with a UTF-8 exception, but when I wrap the InputStream in a BufferedReader, it works fine.
If I download the file to local disk and then open it as a FileInputStream, that also works. I am perplexed. I'm hoping somebody has run into this before, or has some insight into how Jackson handles encoding for a bare InputStream versus a BufferedReader.
This fails:
S3Object s3o = s3Client.getObject("my-bucket","my-key");
Object t = om.readValue(s3o.getObjectContent(), Object.class);
This works:
S3Object s3o = s3Client.getObject("my-bucket","my-key");
Object t = om.readValue(new BufferedReader(new InputStreamReader(s3o.getObjectContent())), Object.class);
The failing call throws this error:
org.codehaus.jackson.JsonParseException: Invalid UTF-8 middle byte 0x5c
at [Source: org.apache.http.conn.EofSensorInputStream@6460029d; line: 1, column: 31611]
at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1433)
at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:521)
at org.codehaus.jackson.impl.Utf8StreamParser._reportInvalidOther(Utf8StreamParser.java:2830)
at org.codehaus.jackson.impl.Utf8StreamParser._reportInvalidOther(Utf8StreamParser.java:2837)
at org.codehaus.jackson.impl.Utf8StreamParser._decodeUtf8_2(Utf8StreamParser.java:2625)
at org.codehaus.jackson.impl.Utf8StreamParser._finishString2(Utf8StreamParser.java:1952)
at org.codehaus.jackson.impl.Utf8StreamParser._finishString(Utf8StreamParser.java:1905)
at org.codehaus.jackson.impl.Utf8StreamParser.getText(Utf8StreamParser.java:276)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:59)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.mapObject(UntypedObjectDeserializer.java:218)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:47)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.mapArray(UntypedObjectDeserializer.java:165)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:51)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.mapObject(UntypedObjectDeserializer.java:218)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:47)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.mapObject(UntypedObjectDeserializer.java:196)
at org.codehaus.jackson.map.deser.std.UntypedObjectDeserializer.deserialize(UntypedObjectDeserializer.java:47)
at org.codehaus.jackson.map.ObjectMapper._readMapAndClose(ObjectMapper.java:2732)
at org.codehaus.jackson.map.ObjectMapper.readValue(ObjectMapper.java:1909)
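Your content is not UTF-8; it is in some other encoding that is not valid for JSON, such as ISO-8859-1 (Latin-1). The error is consistent with that: the parser saw a byte that starts a multi-byte UTF-8 sequence, but the next byte was 0x5c (an ASCII backslash) rather than a continuation byte.
Your use of BufferedReader is a bit wrong too: you should specify the encoding, otherwise the platform-default encoding (which could be anything) is used. It probably works because the reader decodes the bytes using that default encoding, so Jackson receives characters and never does its own UTF-8 decoding. Nonetheless, it sounds like the content is not valid JSON, and whoever produces it should fix it to use one of the supported encodings (UTF-8 or UTF-16).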
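To illustrate the point about specifying the encoding, here is a minimal sketch that uses only the JDK. The class name EncodingSketch, its helper methods, and the sample payload are all hypothetical. The payload is chosen so that a Latin-1 byte (0xDC) is followed by a backslash, which is one way to get the kind of "invalid middle byte 0x5c" failure shown in the question. Whether the real file is actually Latin-1 is an assumption.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Strict check: a fresh decoder reports malformed input instead of replacing it.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder().decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Wrap a byte stream in a Reader with an explicit charset,
    // rather than relying on the platform default.
    static Reader readerFor(InputStream in, Charset charset) {
        return new BufferedReader(new InputStreamReader(in, charset));
    }

    // Read the whole stream into a String using the given charset.
    static String decode(InputStream in, Charset charset) {
        try (Reader reader = readerFor(in, charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical payload: U+00DC becomes the single byte 0xDC in Latin-1,
        // and the next byte is a backslash (0x5c).
        String json = "{\"v\":\"\u00dc\\n\"}";
        byte[] latin1 = json.getBytes(StandardCharsets.ISO_8859_1);

        // 0xDC announces a two-byte UTF-8 sequence, but 0x5c is not a continuation byte.
        System.out.println(isValidUtf8(latin1)); // false

        // Decoding with the charset the bytes were written in recovers the text.
        String text = decode(new ByteArrayInputStream(latin1), StandardCharsets.ISO_8859_1);
        System.out.println(text.equals(json)); // true

        // Re-encoded as UTF-8, the same text is what a JSON producer should emit.
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8))); // true
    }
}
```

Applied to the code in the question, the same idea would be to pass new InputStreamReader(s3o.getObjectContent(), charset) to om.readValue, where charset is whatever encoding the file was really written in. That is a workaround only; having the producer emit UTF-8 is still the proper fix.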