Say I have a loop that looks like this:
for (int i = 0; i < 10000; i++) {
    /* Do something computationally expensive */
    if (i < 200 && !(i % 20)) {
        /* Do something else */
    }
}
wherein some trivial task gets stuck behind an if-statement that only runs a handful of times.
I’ve always heard that “if-statements in loops are slow!” So, in the hope of (marginally) better performance, I split the loop into two:
for (int i = 0; i < 200; i++) {
    /* Do something computationally expensive */
    if (!(i % 20)) {
        /* Do something else */
    }
}
for (int i = 200; i < 10000; i++) {
    /* Do something computationally expensive */
}
Will gcc (with the appropriate flags, like -O3) automatically break the one loop into two, or does it only unroll to decrease the number of iterations?
Why not just disassemble the program and see for yourself? But here we go. This is the test program:
and this is the interesting part of the disassembled code, compiled with gcc 4.3.3 and -O3:
So, as we can see, for this particular example the answer is no, it does not. There is only one loop, starting at main+32 and ending at main+85. If you have trouble reading the assembly: ecx holds i and ebx holds sum.
Still, your mileage may vary. Who knows which heuristics apply in this particular case, so you’ll have to compile the code you actually have in mind and see how longer or more complicated computations influence the optimizer.
That said, on any modern CPU the branch predictor will do pretty well on such simple code, so you won’t see much of a performance loss in either case. What does a handful of mispredictions cost if your computation-intensive code needs billions of cycles?
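If you want to repeat the experiment with your own compiler, something like the following could serve as a starting point. It is only a sketch, not the test program referred to above: `expensive`, `rare_task`, `sum_fused` and `sum_split` are made-up names, and the arithmetic is an arbitrary stand-in for real work. The `seed` parameter is there so the compiler cannot simply precompute the result.

```c
#include <stdint.h>

/* Hypothetical stand-in for the "computationally expensive" step. */
static uint32_t expensive(uint32_t x) {
    x *= 2654435761u;
    x ^= x >> 15;
    return x * 2246822519u;
}

/* Hypothetical stand-in for the rarely executed extra task. */
static uint32_t rare_task(uint32_t x) {
    return x + 1u;
}

/* Original form: one loop with the guarded branch inside. */
uint32_t sum_fused(uint32_t seed) {
    uint32_t sum = 0;
    for (int i = 0; i < 10000; i++) {
        sum += expensive(seed + (uint32_t)i);
        if (i < 200 && !(i % 20)) {
            sum += rare_task(seed + (uint32_t)i);
        }
    }
    return sum;
}

/* Hand-split form: the branch only exists in the first 200 iterations. */
uint32_t sum_split(uint32_t seed) {
    uint32_t sum = 0;
    for (int i = 0; i < 200; i++) {
        sum += expensive(seed + (uint32_t)i);
        if (!(i % 20)) {
            sum += rare_task(seed + (uint32_t)i);
        }
    }
    for (int i = 200; i < 10000; i++) {
        sum += expensive(seed + (uint32_t)i);
    }
    return sum;
}
```

Compiling this with `gcc -O3 -S` and comparing the assembly generated for `sum_fused` and `sum_split` should show whether your gcc version splits the loop on its own. Both functions return the same value for the same `seed`, so any difference is purely in the generated code.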