Using java.net and java.io, what is the fastest way to read HTML from a URL and write it to a file or the console? Is BufferedReader/BufferedWriter faster than InputStreamReader/OutputStreamWriter? Are Readers and Writers faster than InputStreams and OutputStreams?
I am experiencing serious lag with the following output writer/stream:
URLConnection ii=i.openConnection();
BufferedReader iik = new BufferedReader(new InputStreamReader(ii.getInputStream()));
String op;
while(iik.readLine()!=null) {
op=iik.readLine();
System.out.println(op);
}
But curiously, I am experiencing close to no lag with the following code:
URLConnection ii=i.openConnection();
Reader xh=new InputStreamReader(ii.getInputStream());
int r;
Writer xy=new PrintWriter(System.out);
while((r=xh.read())!=-1) {
xy.write(r);
}
xh.close();
xy.close();
What is going on here?
Readers/Writers shouldn’t be inherently faster than Input/OutputStreams.
That said, going through readLine() and println() probably isn’t the optimal way of transferring bytes. In your case, if the file you’re loading doesn’t contain many newline characters, BufferedReader will have to buffer a lot of data before readLine() will return.

The canonical non-terrible way of transferring data between streams is to do it in chunks through a buffer.
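A chunked copy of this kind might look like the sketch below. The class name ChunkedCopy, the method name copy and the 8 KB buffer size are illustrative choices, not anything from the question; the loop itself uses only InputStream.read(byte[]) and OutputStream.write(byte[], int, int).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.net.URLConnection;

class ChunkedCopy {
    // Copies everything from in to out in fixed-size chunks and
    // returns the number of bytes transferred.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // buffer size is an assumption; tune as needed
        long total = 0;
        int n;
        // read() returns the number of bytes placed in the buffer, or -1 at end of stream
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }

    // Hypothetical usage: dump the page at the given URL to the console.
    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ChunkedCopy.java <url>");
            return;
        }
        URLConnection connection = URI.create(args[0]).toURL().openConnection();
        try (InputStream in = connection.getInputStream()) {
            copy(in, System.out);
        }
    }
}
```

Because this copies raw bytes, no character decoding happens along the way. If you need the text as a String, decode it afterwards with the charset the server reports.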
It might be faster still to use NIO. The code for it is a little less straightforward, and I just use the one found in this blog post.
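For reference, a channel-based copy could be sketched as follows. This is not necessarily the code from the blog post mentioned above; the class name ChannelCopy and the 16 KB direct buffer are assumptions made for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

class ChannelCopy {
    // Copies all bytes from src to dest through a direct ByteBuffer.
    public static void copy(ReadableByteChannel src, WritableByteChannel dest) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024); // size is an assumption
        while (src.read(buffer) != -1) {
            buffer.flip();     // switch the buffer from filling to draining
            dest.write(buffer);
            buffer.compact();  // keep any bytes a partial write left behind
        }
        // End of input reached: drain whatever is still in the buffer.
        buffer.flip();
        while (buffer.hasRemaining()) {
            dest.write(buffer);
        }
    }
}
```

Existing streams can be adapted with Channels.newChannel(), for example ChannelCopy.copy(Channels.newChannel(ii.getInputStream()), Channels.newChannel(System.out)). Whether this beats the plain byte-array loop depends on the JVM and the streams involved, so it is worth measuring.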
If you’re reading from or writing to a file, the best method is to use a zero-copy approach, which Java makes available with FileChannel.transferFrom() and transferTo(). Sample code is available in a DeveloperWorks article.
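As a rough illustration of transferFrom(), here is a minimal sketch that saves a stream to a file. It is not the sample from the DeveloperWorks article; the class name TransferToFile, the method name save and the 1 MB request size are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class TransferToFile {
    // Writes everything readable from in to target and returns the byte count.
    public static long save(InputStream in, Path target) throws IOException {
        try (ReadableByteChannel src = Channels.newChannel(in);
             FileChannel dest = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long n;
            // transferFrom() may move fewer bytes than requested, so loop
            // until it reports that nothing more was transferred.
            while ((n = dest.transferFrom(src, position, 1 << 20)) > 0) {
                position += n;
            }
            return position;
        }
    }
}
```

Whether the transfer is truly zero-copy depends on the channel types and the operating system. With a channel wrapped around a network InputStream, as here, the data most likely still passes through an intermediate buffer; the file-to-file and file-to-socket cases are where the approach is expected to pay off.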