I need to convert the content of an InputStream into a String. The difficulty here is the input encoding, namely Latin-1. I tried several approaches and code snippets with String, getBytes, char[], etc. in order to get the encoding straight, but nothing seemed to work.
Finally, I came up with the working solution below. However, this code seems a little verbose to me, even for Java. So the question here is:
Is there a simpler and more elegant approach to achieve what is done here?
private String convertStreamToStringLatin1(java.io.InputStream is)
        throws IOException {
    String text = "";
    // setup readers with Latin-1 (ISO 8859-1) encoding
    BufferedReader i = new BufferedReader(new InputStreamReader(is, "8859_1"));
    int numBytes;
    CharBuffer buf = CharBuffer.allocate(512);
    while ((numBytes = i.read(buf)) != -1) {
        text += String.copyValueOf(buf.array(), 0, numBytes);
        buf.clear();
    }
    return text;
}
First, a few criticisms of the approach you’ve taken. You shouldn’t use an NIO CharBuffer when all you need is a char[512]. With a plain array there is also no need to clear the buffer on each iteration.
You should also know that constructing a String with those arguments has the same effect as String.copyValueOf, since the constructor also copies the data.
You can use a ByteArrayOutputStream, which grows an internal buffer dynamically to accommodate all the data. You can then decode the entire byte[] returned by toByteArray into a String.
The advantage is that deferring decoding until the end avoids decoding fragments individually. Decoding fragments may work for single-byte charsets like ASCII or ISO-8859-1, but it will not work for multi-byte schemes like UTF-8 and UTF-16, where a single character can be split across two fragments. This makes it easier to change the character encoding in the future, since the reading code requires no modification.
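The two suggestions above could look roughly like the following sketch. The class and method names (StreamDecoding, readFully, readWithReader) are made up for illustration, and the use of StringBuilder instead of string concatenation is my own assumption rather than something stated above. Passing a Charset constant such as StandardCharsets.ISO_8859_1 (Java 7+) is also an assumption; the "8859_1" name from the question should work as well.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper class; the names are illustrative only.
final class StreamDecoding {

    // Collect all raw bytes first, then decode once at the end.
    // Because decoding happens on the complete byte[], a multi-byte
    // character can never be split across two chunks.
    static String readFully(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n); // the stream grows its internal buffer as needed
        }
        return new String(out.toByteArray(), charset);
    }

    // Reader-based variant of the original code, using a plain char[]
    // instead of a CharBuffer. StringBuilder is an assumption added here
    // to avoid repeated string concatenation.
    static String readWithReader(InputStream in, Charset charset) throws IOException {
        Reader reader = new InputStreamReader(in, charset);
        StringBuilder text = new StringBuilder();
        char[] buf = new char[512];
        int n;
        while ((n = reader.read(buf)) != -1) {
            text.append(buf, 0, n); // no clear() needed for a plain array
        }
        return text.toString();
    }
}
```

For the Latin-1 case in the question, a call might look like StreamDecoding.readFully(is, StandardCharsets.ISO_8859_1). Switching to another encoding would then only mean passing a different Charset.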