I want to read a file into a RichTextBox without using LoadFile (I might want to display the progress). The file contains only ASCII characters.
I was thinking of reading the file in chunks.
I have done the following (which is working):
    const int READ_BUFFER_SIZE = 4 * 1024;
    using (BinaryReader reader = new BinaryReader(File.Open("file.txt", FileMode.Open)))
    {
        byte[] buf = new byte[READ_BUFFER_SIZE];
        do {
            int ret = reader.Read(buf, 0, READ_BUFFER_SIZE);
            if (ret <= 0) {
                break;
            }
            // Decode only the bytes actually read; the last chunk is usually shorter.
            string text = Encoding.ASCII.GetString(buf, 0, ret);
            richTextBox.AppendText(text);
        } while (true);
    }
My concern is:
    string text = Encoding.ASCII.GetString(buf, 0, ret);
I have seen that it is not possible to add a byte[] to a RichTextBox.
My questions are:
1. Will a new string object be allocated for every chunk which is read?
2. Isn’t there a better way not to have to create a string object just for appending the text to the RichTextBox?
3. Or, is it more efficient to read lines from the file (StreamReader.ReadLine) and just add to the RichTextBox the string returned?
1. Yes.
2. No, AppendText requires a string.
3. No, that’s considerably less efficient. You’d create a new string object much more frequently. That is okay from the garbage-collected heap’s perspective, since you don’t create more garbage. But it is absolute murder on the RichTextBox: it constantly needs to re-allocate its own buffer, which includes moving all the text previously read. What you have is already good; you should just use a much larger READ_BUFFER_SIZE.
Unfortunately there are conflicting goals here. You don’t want to make the buffer much larger than 39,999 bytes: each ASCII byte becomes a 2-byte char, so the decoded string approaches the 85,000-byte threshold beyond which objects are allocated in the Large Object Heap, where they clog it up until a gen 2 garbage collection happens. But the RTB will be much happier if you go considerably past that size, like a megabyte if the file is so large that you need a progress bar.
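As a rough sketch of the chunked approach with a configurable buffer size and progress reporting, something like the following could work. The `ChunkedTextLoader` class, its `Load` method and the two callbacks are hypothetical names made up for illustration, not part of any framework API; it assumes the source is plain ASCII, as in the question.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, for illustration only.
static class ChunkedTextLoader
{
    // Reads an ASCII stream in chunks of up to bufferSize bytes. Each decoded
    // chunk goes to `append`; the fraction read so far (0.0 to 1.0) goes to
    // `reportProgress` when the stream length is known.
    public static void Load(Stream source, int bufferSize,
                            Action<string> append, Action<double> reportProgress)
    {
        byte[] buf = new byte[bufferSize];
        long total = source.CanSeek ? source.Length - source.Position : 0;
        long done = 0;
        int read;
        while ((read = source.Read(buf, 0, buf.Length)) > 0)
        {
            // Decode only the bytes actually read; the rest of buf may hold stale data.
            append(Encoding.ASCII.GetString(buf, 0, read));
            done += read;
            if (total > 0)
            {
                reportProgress((double)done / total);
            }
        }
    }
}
```

In a form it might be called along these lines, with `progressBar` standing in for whatever progress control you use: `using (FileStream fs = File.OpenRead("file.txt")) ChunkedTextLoader.Load(fs, 1024 * 1024, richTextBox.AppendText, p => progressBar.Value = (int)(p * 100));`. Keeping the buffer size a parameter makes it easy to try both sides of the trade-off described above.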
If you want to make it really efficient then you need to replace RichTextBox.LoadFile(). The underlying Windows message is EM_STREAMIN, which uses a callback mechanism to stream in the text. You can technically replace the callback to do what the default one does in RichTextBox, plus update a progress bar. It does permit getting rid of the strings, btw. The pinvoke is pretty unfriendly; use the Reference Source for guidance.
Take the easy route first, increase the buffer size. Only consider using the pinvoke route when your code is considerably slower than using File.ReadAllText().