I work mainly on Windows and Windows CE-based systems, where CreateFile, ReadFile and WriteFile are the workhorses, whether I'm in native Win32 land or in managed .NET land.
So far I have never had any obvious problem writing or reading big files in one chunk, as opposed to looping over several smaller chunks. I usually delegate the I/O work to a background thread that notifies me when it's done.
But looking at file I/O tutorials or "textbook examples", I often find the "loop with small chunks" pattern used without any explanation of why it's preferred over the more obvious (I dare say!) "do it all at once".
Are there any drawbacks to the way I do it that I haven't understood?
Clarification:
By "big file" I mean the files I read in a single chunk, as opposed to the multiple-chunk examples. The multiple-chunk examples I mentioned often use chunk sizes on the order of 1024 bytes on Windows CE and ten times that on the desktop. My big files are usually binary files such as camera photos from mobile phones, and as such in the 2-10 MB range. Nowhere near 1 GB, in other words.
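For reference, the "do it all at once" approach I'm describing looks roughly like this. This is a minimal sketch using portable C stdio rather than the actual CreateFile/ReadFile calls, and `read_whole_file` is a hypothetical helper name, not anything from my real code:

```c
#include <stdio.h>
#include <stdlib.h>

/* Sketch: read an entire file into one malloc'd buffer.
   Returns the buffer (caller frees) and sets *out_len, or returns
   NULL on any failure. Assumes the file fits comfortably in memory. */
static unsigned char *read_whole_file(const char *path, long *out_len)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    /* Determine the file size by seeking to the end. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return NULL; }
    rewind(f);

    /* One allocation, one read call for the whole file. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    if (buf && fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);
        buf = NULL;
    }
    fclose(f);
    if (buf)
        *out_len = len;
    return buf;
}
```

In my real code the equivalent runs on a background thread, so the UI never blocks on the single large read.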
In general, you shouldn't assume that a stream will read all the data in one go. While that may be true for local files, it may well not work for network files... and it definitely won't work for general network streams unless a higher level has already buffered them.
Then there's the matter of memory: suppose someone asks you to process a 3 GB file. If you stream it, processing a chunk at a time, you've got no problems. If you try to read the whole thing into memory, you're unlikely to succeed...
In general: if you can stream it, do. Why would you want to use a less reliable and less efficient approach? For any sort of robustness you'd still have to check the return value of Read and compare it with how much you expected to read... so adding a loop doesn't add much complexity. Also, if you find yourself doing this a lot, you may well spot patterns which you could encapsulate into helper methods, quite possibly taking delegates to represent the custom actions being taken for processing.
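The "check the return value and loop" pattern can be sketched like this. This is a hedged example in portable C (`fread` rather than Win32 `ReadFile`, which has the same may-return-less-than-requested behaviour), and `read_fully` is a made-up helper name for illustration:

```c
#include <stdio.h>

/* Sketch of a robust read loop: fread (like ReadFile or Stream.Read)
   may legitimately return fewer bytes than requested, so we keep
   calling it until the buffer is full or the stream is exhausted.
   Returns the number of bytes actually read. */
static size_t read_fully(FILE *f, unsigned char *buf, size_t count)
{
    size_t total = 0;
    while (total < count) {
        size_t got = fread(buf + total, 1, count - total, f);
        if (got == 0)   /* EOF or error: stop and report what we have */
            break;
        total += got;
    }
    return total;
}
```

A chunked processor is then just this loop with a fixed-size buffer and a callback (or delegate, in .NET terms) invoked per chunk; the caller compares the return value against the requested count to detect a short read.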