Using the latest gcc compiler, do I still have to think about these types of manual loop optimizations, or will the compiler take care of them for me well enough?
If your profiler tells you there is a problem with a loop, and only then, one thing to watch out for is a memory reference in the loop that you know is invariant across the loop but the compiler does not. A contrived example is a loop that bubbles an element out to the end of an array, calling `swap_elements` and reading `a->length` on every iteration.
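A minimal sketch of such a loop, assuming a hypothetical `struct array` with an `int length` field and an `int *data` buffer (only the names `a->length` and `swap_elements` come from the discussion; everything else is illustrative):

```c
#include <assert.h>

/* Hypothetical array type; only the field name `length` comes from the text. */
struct array {
    int length;
    int *data;
};

/* In the scenario described, this function would live in another source
   file, so the compiler could not see that it leaves a->length untouched.
   It is defined here only to keep the sketch self-contained. */
void swap_elements(struct array *a, int i, int j)
{
    assert(i >= 0 && j >= 0 && i < a->length && j < a->length);
    int tmp = a->data[i];
    a->data[i] = a->data[j];
    a->data[j] = tmp;
}

/* Bubble the first element out to the end of the array. */
void bubble_to_end(struct array *a)
{
    /* a->length is a memory reference in the loop condition; unless the
       compiler can prove the call does not modify it, it may be reloaded
       on every iteration. */
    for (int i = 0; i < a->length - 1; i++)
        swap_elements(a, i, i + 1);
}
```

Calling `bubble_to_end` on an array holding 1, 2, 3, 4 leaves it holding 2, 3, 4, 1.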
You may know that the call to `swap_elements` does not change the value of `a->length`, but if the definition of `swap_elements` is in another source file, it is quite likely that the compiler does not. Hence it can be worthwhile hoisting the computation of `a->length` out of the loop.

On performance-critical inner loops, my students get measurable speedups with transformations like this one.
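The hoisted form can be sketched as follows, again assuming a hypothetical `struct array` and a local variable `n` that holds `a->length`. The function name `bubble_to_end_hoisted` is made up for illustration.

```c
#include <assert.h>

/* Hypothetical array type; only the field name `length` comes from the text. */
struct array {
    int length;
    int *data;
};

/* Stand-in for a function that would normally be defined in another
   source file, out of the optimizer's sight. */
void swap_elements(struct array *a, int i, int j)
{
    assert(i >= 0 && j >= 0 && i < a->length && j < a->length);
    int tmp = a->data[i];
    a->data[i] = a->data[j];
    a->data[j] = tmp;
}

/* Same loop, with the memory reference hoisted by hand. */
void bubble_to_end_hoisted(struct array *a)
{
    int n = a->length;  /* read once, before the loop */

    /* n is a local whose address is never taken, so the compiler knows
       the call cannot change it; n - 1 is left in the condition as is. */
    for (int i = 0; i < n - 1; i++)
        swap_elements(a, i, i + 1);
}
```

This is only valid because `swap_elements` really does leave `a->length` alone; if it could change the length, the hoisted loop would behave differently from the original.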
Note that there’s no need to hoist the computation of `n-1` (where `n` is the local variable holding the hoisted `a->length`); any optimizing compiler is perfectly capable of discovering loop-invariant computations among local variables. It’s memory references and function calls that may be more difficult. And the code with `n-1` is more manifestly correct.

As others have noted, you have no business doing any of this until you’ve profiled and have discovered that the loop is a performance bottleneck that actually matters.