I have a mildly large function (about 80 lines of code without comments) that I’m optimizing.
As part of trying to let the profiler do the work for me, I took 2 chunks of code and put them in separate functions (this is supposed to be only temporary until I can put them back in).
The interesting part is this:
My test case originally took 29.8 seconds.
After I put the first chunk into a separate function, I saw a small performance loss (30.2 seconds), presumably due to function call overhead.
When I put the second chunk of code into a separate function, I got a pretty huge performance gain, down to 24.2 seconds.
The second chunk of code is an insertion into a rather large linked list, which I plan to replace with a binary tree or something similar, but this 20% improvement is still pretty confusing to me.
tl;dr: While trying to optimize code, I noticed that putting a block of code into a separate function gave me a 20% performance increase. How is that possible?
Edit: confirmed running in release build as well
By extracting this block of code you made the function simpler. Maybe that helped the compiler generate more efficient code for it. It might have relieved register pressure because there are fewer local variables live in the function at once.
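To make the extraction concrete, here is a minimal sketch in C++ (the question does not say which language is used, so that choice is an assumption). The names `Node`, `insert_sorted`, and `process` are hypothetical stand-ins for the asker's code, and the `NOINLINE` macro is only there to stop the compiler from undoing the extraction by inlining the helper again.

```cpp
#include <cassert>
#include <cstddef>

// Compiler-specific "do not inline" marker, so the extracted chunk
// really stays a separate function in an optimized build.
#if defined(_MSC_VER)
#define NOINLINE __declspec(noinline)
#else
#define NOINLINE __attribute__((noinline))
#endif

// Hypothetical list node; the real code's node type is not shown.
struct Node {
    int value;
    Node* next;
};

// The extracted chunk: insert `node` into an ascending singly linked list.
// Its loop variables no longer compete for registers with the caller's locals.
NOINLINE Node* insert_sorted(Node* head, Node* node) {
    assert(node != nullptr);
    Node** link = &head;
    while (*link != nullptr && (*link)->value < node->value) {
        link = &(*link)->next;
    }
    node->next = *link;
    *link = node;
    return head;
}

// Hypothetical stand-in for the large function: some arithmetic per element
// plus the list insertion that used to be written inline.
long process(const int* data, std::size_t n, Node* pool, Node*& head) {
    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += data[i] * 2L;
        pool[i].value = data[i];
        head = insert_sorted(head, &pool[i]);
    }
    return sum;
}
```

Whether this helps or hurts depends on the compiler and the surrounding code, so the only reliable check is to measure both variants.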
Sometimes, it is just coincidence. Jiggling code around randomly is very likely to change performance (in either direction). Maybe you just happened to hit an improvement instead of a deterioration.
Why does “jiggling” change performance? It might change address alignment, branch prediction, the compiler's view of what is hot and what is cold, and CPU instruction cache usage.
All of these things are implementation details from a semantic standpoint, yet they influence performance. They are quite unpredictable because they operate at a very low level and interact in complex ways.
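Because these effects are so hard to predict, one way to tell a real improvement from noise is to time each variant several times and compare a robust statistic such as the median. The following is only a sketch in C++; `time_runs` and `median` are hypothetical helper names, not part of any library.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Median of the samples (the upper median when the count is even).
// Takes a copy because std::nth_element reorders its input.
double median(std::vector<double> samples) {
    assert(!samples.empty());
    auto mid = samples.begin() + samples.size() / 2;
    std::nth_element(samples.begin(), mid, samples.end());
    return *mid;
}

// Runs `work` `runs` times and returns the wall-clock duration of each
// run in seconds, measured with a monotonic clock.
template <typename Fn>
std::vector<double> time_runs(Fn&& work, int runs) {
    assert(runs > 0);
    std::vector<double> seconds;
    seconds.reserve(runs);
    for (int i = 0; i < runs; ++i) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        seconds.push_back(std::chrono::duration<double>(stop - start).count());
    }
    return seconds;
}
```

Calling `median(time_runs(variant, 10))` for both the inlined and the extracted variant gives two numbers to compare. If their difference is smaller than the spread between individual runs of the same variant, it is probably just jiggle.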