I’ve tried the code below on both Windows (64-bit) and Linux (32-bit).
I was sure that without BufferedOutputStream the code was bound to throw an OutOfMemoryError, yet it didn’t.
Why is that? Who is doing the {caching / buffering / streaming} to disk there?
Can you please describe, if relevant to the answer, the full flow (Java API -> system call)?
Does this code use NIO?
/Me confused.
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;

public class WriteHugeFileToDisk {
    private static final int BYTE = 1;
    private static final int KILOBYTE = BYTE * 1024;
    private static final int MEGABYTE = KILOBYTE * 1024;
    private static final int GIGABYTE = MEGABYTE * 1024;
    private static final long TERABYTE = GIGABYTE * 1024L;

    public static void main(String[] args) throws IOException {
        FileOutputStream fileOutputStream = new FileOutputStream(args[0]);
        DataOutputStream dataOutputStream = new DataOutputStream(fileOutputStream);
        // A single 1 MB buffer, filled once with a repeating byte pattern.
        byte[] buffer = new byte[MEGABYTE];
        for (int i = 0; i < buffer.length; i++) {
            buffer[i] = (byte) i;
        }
        // The same buffer is written 4000 times, about 4 GB in total.
        for (long l = 0; l < 4000; l++) {
            dataOutputStream.write(buffer);
        }
    }
}
I ran this code with Java 6, using the following invocations:
Windows:
java WriteHugeFileToDisk %TEMP%\hi.txt
Linux:
java WriteHugeFileToDisk /mnt/hi.info
Please note: the code creates a file of roughly 4 GB (4000 writes of 1 MB each), filled with a repeating byte pattern, just for the test.
Why would it throw an OutOfMemoryError? It’s just writing to disk. I wouldn’t be surprised if FileOutputStream and DataOutputStream had some buffering (I haven’t checked), but they’re certainly not required to buffer everything you write.
This code isn’t using NIO directly, although I wouldn’t be surprised if some of the internal stuff did. As for what system calls are involved and when, that will be implementation-specific, but the important thing is to realise that neither DataOutputStream nor FileOutputStream is meant to buffer everything. You write some data to them, and some of that data may get written to disk. If you flush or close the stream, that should push all the data you’ve written so far out to the operating system, and from there to the disk. If you don’t flush or close the stream, I’d expect only a reasonably small amount (again, implementation-specific) to be cached, if any.
Note that BufferedOutputStream does introduce caching, but only as much as you ask for (or a default). Again, it wouldn’t buffer everything unless you asked for a buffer as large as the total amount of data you write.
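To make the point about bounded buffering concrete, here is a minimal sketch. The class name BoundedBufferDemo, the helper writeChunks, and the chunk and buffer sizes are made up for illustration and are not part of the question’s code. It writes the same small array many times, once straight to a FileOutputStream and once through a BufferedOutputStream with an explicit 8 KB buffer. In both cases the only memory the program itself holds is the reused array, plus the fixed-size buffer in the second case.

```java
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class BoundedBufferDemo {
    // Hypothetical helper: writes `chunk` to `file` `chunks` times and returns
    // the resulting file length. A bufferSize of 0 means "no BufferedOutputStream".
    public static long writeChunks(File file, byte[] chunk, int chunks, int bufferSize)
            throws IOException {
        // FileOutputStream(File) truncates an existing file before writing.
        OutputStream out = new FileOutputStream(file);
        if (bufferSize > 0) {
            // Holds at most bufferSize bytes in memory before passing them on.
            out = new BufferedOutputStream(out, bufferSize);
        }
        try {
            for (int i = 0; i < chunks; i++) {
                out.write(chunk); // the same array is reused on every iteration
            }
            out.flush(); // pass anything still buffered on to the underlying stream
        } finally {
            out.close(); // Java 6 style; try-with-resources needs Java 7 or later
        }
        return file.length();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("bounded-buffer-demo", ".bin");
        file.deleteOnExit();
        byte[] chunk = new byte[1024];
        long unbuffered = writeChunks(file, chunk, 64, 0);
        long buffered = writeChunks(file, chunk, 64, 8192);
        System.out.println(unbuffered + " " + buffered);
    }
}
```

Both calls should report 65536 bytes (64 chunks of 1024 bytes), since the second call overwrites the file rather than appending to it. Note that flush() only hands the data to the operating system. If you need it forced onto the storage device, FileOutputStream.getFD().sync() is the call to look at, and whether that matters depends on your durability requirements.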