First piece of code:
// code is a private field, shared by all the methods of the class
// SourceCodeBuilder is a class that wraps a StringBuilder:
// it builds formatted Strings through many appends, one per loc() call (see below)
private SourceCodeBuilder code = new SourceCodeBuilder();
[...]
// create "file.txt" and call algorithm
fileOut = new FileWriter("file.txt");
for (int i = 0; i < x; i++) {
    algorithm();
}
Where algorithm() is a method like this:
private void algorithm() {
    for (int i = 0; i < y; i++) {
        code.loc("some text");
        code.loc("other text");
        ...
    }
    // after building the code value, write it to the file
    fileOut.write(code.toString());
    fileOut.flush();
    // free() empties the code variable, so the next call to algorithm() starts with code set to ""
    // (basically it calls StringBuilder's setLength(0)); this keeps memory usage low
    code.free();
}
When I run this on large text files, it takes about 4500ms to execute and uses less than 60MB of memory.
Then I tried to use this other code.
Second piece of code:
private SourceCodeBuilder code = new SourceCodeBuilder();
[...]
// create "file.txt" and call algorithm
fileOut = new FileWriter("file.txt");
for (int i = 0; i < x; i++) {
    algorithm();
}
fileOut.write(code.toString());
fileOut.flush();
fileOut.close();
Where this time algorithm() is a method like this:
private void algorithm() {
    for (int i = 0; i < y; i++) {
        code.loc("some text");
        code.loc("other text");
        ...
    }
}
It uses more than 250MB of memory (which is expected, because I never call free() on the code variable, so it is one continuous append to the same variable), but surprisingly it takes more than 5300ms to execute.
That’s roughly 18% slower than the first code (5300ms versus 4500ms), and I can’t explain why.
In the first code I write small pieces of text to “file.txt” many times. In the second code I write one big piece of text to “file.txt” only once, using more memory. With the second code I expected higher memory consumption, but not higher CPU consumption, since it performs fewer I/O operations than the first.
Conclusion: the first piece of code is faster than the second one, even though it does more I/O operations. Why? Am I missing something?
When you slowly fill a large memory buffer, the time required can grow non-linearly, because the buffer has to be reallocated multiple times, and each reallocation copies the entire content to a new location in memory. This takes time, especially when the buffer is 200MB+. If you preallocate the buffer, your process may go faster.
However, all the above is just my guess. You should profile your application to see where the additional time really goes.
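To make the reallocation point concrete, here is a minimal sketch that counts how often a plain StringBuilder has to grow its backing array, with and without a preallocated capacity. It is only an illustration of the guess above, not a measurement of the original program: the class name PreallocationSketch, the helper countReallocations, and the workload size are made up for the example, and whether SourceCodeBuilder exposes a way to pass an initial capacity to its internal StringBuilder is an assumption.

```java
class PreallocationSketch {

    // Appends `lines` copies of `text` to `sb` and counts how often the
    // builder's capacity changed, i.e. how often its backing array was
    // reallocated and the existing content copied over.
    static int countReallocations(StringBuilder sb, String text, int lines) {
        int reallocations = 0;
        int capacity = sb.capacity();
        for (int i = 0; i < lines; i++) {
            sb.append(text);
            if (sb.capacity() != capacity) {
                capacity = sb.capacity();
                reallocations++;
            }
        }
        return reallocations;
    }

    public static void main(String[] args) {
        String text = "some text\n"; // placeholder line, as in the question
        int lines = 1_000_000;       // hypothetical workload size

        // Default constructor: small initial capacity, grows on demand.
        int grown = countReallocations(new StringBuilder(), text, lines);

        // Preallocated: the final size is estimated up front, so the
        // backing array never has to grow during the appends.
        int preallocated = countReallocations(
                new StringBuilder(text.length() * lines), text, lines);

        System.out.println("reallocations without preallocation: " + grown);
        System.out.println("reallocations with preallocation:    " + preallocated);
    }
}
```

Each counted reallocation copies everything appended so far, so the late ones are the expensive ones when the buffer holds hundreds of megabytes. Wrapping both variants in System.nanoTime() calls, or running the real program under a profiler, would show whether this copying actually accounts for the extra time.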