I wrote the following code:
try {
    URL url = new URL("http://bbc.com");
    InputStream is = url.openStream();
    BufferedReader in = new BufferedReader(new InputStreamReader(is, "UTF-8"));
    System.out.println(in.readLine());
    // in.close(); // if uncommented, the lines below throw java.io.IOException: stream is closed
    in = new BufferedReader(new InputStreamReader(is, "iso-8859-2"));
    System.out.println(in.readLine().length());
} catch (Exception ex) {
    ex.printStackTrace();
}
The problem is that the second BufferedReader starts reading from a different point on almost every run of the program (the printed length differs). The same problem occurs when both readers use the same encoding. How can I detect the encoding and then read the content with that encoding without creating a new InputStream? (Creating a new InputStream takes 0.1 to 3 s, depending on the site.)
The first BufferedReader reads ahead and buffers more than the one line it returns, and how much it consumes depends on how the bytes arrive, so the second reader starts wherever the first one happened to stop.
I suggest you copy the entire stream, e.g. by repeatedly calling read() and writing the results into a ByteArrayOutputStream. You can then get a byte array from that, and create multiple independent ByteArrayInputStream wrappers around the byte array.
(You can use Guava’s ByteStreams.toByteArray(is) as an alternative for the first part.)
Another alternative would be to wrap the original InputStream in a BufferedInputStream, call mark immediately with a “large enough” limit, then reset it after you’ve read the first line, before creating the second BufferedReader.
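Both suggestions can be sketched roughly as follows. This is only an illustration under a few assumptions: the class name RereadStream and its helper methods are made up for the example, a small in-memory sample stands in for url.openStream(), and the 64 KiB mark limit is an arbitrary guess at "large enough". Note that the limit has to cover everything the first reader pulls in while reading ahead, not just the first line; otherwise reset() can fail.

```java
import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
class RereadStream {

    // Drain the stream once into memory by repeatedly calling read().
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    // Read the first line with the given charset. The reader is deliberately
    // not closed, because closing it would also close the underlying stream.
    static String firstLine(InputStream in, String charsetName) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, charsetName));
        return reader.readLine();
    }

    // Approach 1: copy everything, then wrap the byte array as often as needed.
    static String[] viaByteArray(InputStream source, String first, String second)
            throws IOException {
        byte[] data = readAll(source);
        return new String[] {
            firstLine(new ByteArrayInputStream(data), first),
            firstLine(new ByteArrayInputStream(data), second)
        };
    }

    // Approach 2: mark the start, read once, reset, read again.
    static String[] viaMarkReset(InputStream source, String first, String second,
                                 int markLimit) throws IOException {
        BufferedInputStream buffered = new BufferedInputStream(source);
        buffered.mark(markLimit);   // must cover the first reader's read-ahead
        String a = firstLine(buffered, first);
        buffered.reset();           // rewind to the marked position
        String b = firstLine(buffered, second);
        return new String[] { a, b };
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for url.openStream(); real code would pass that stream instead.
        byte[] sample = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);

        String[] copied = viaByteArray(new ByteArrayInputStream(sample), "UTF-8", "ISO-8859-2");
        System.out.println(copied[0] + " | " + copied[1]);

        String[] rewound = viaMarkReset(new ByteArrayInputStream(sample), "UTF-8", "ISO-8859-2", 1 << 16);
        System.out.println(rewound[0] + " | " + rewound[1]);
    }
}
```

The byte-array version keeps the whole response in memory but lets you re-read it any number of times. The mark/reset version avoids the full copy, at the cost of having to pick a limit in advance.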