This question is mostly academic. I ask out of curiosity, not because this poses an actual problem for me.
Consider the following incorrect C program.
    #include <signal.h>
    #include <stdio.h>

    static int running = 1;

    void handler(int u) {
        running = 0;
    }

    int main() {
        signal(SIGTERM, handler);
        while (running)
            ;
        printf("Bye!\n");
        return 0;
    }
This program is incorrect because the handler interrupts the program flow, so running can be modified at any time and should therefore be declared volatile. But let’s say the programmer forgot that.
gcc 4.3.3, with the -O3 flag, compiles the loop body (after one initial check of the running flag) down to the infinite loop
    .L7:
        jmp .L7
which was to be expected.
Now we put something trivial inside the while loop, like:
    while (running)
        putchar('.');
And suddenly, gcc does not optimize the loop condition anymore! The loop body’s assembly now looks like this (again at -O3):
    .L7:
        movq stdout(%rip), %rsi
        movl $46, %edi
        call _IO_putc
        movl running(%rip), %eax
        testl %eax, %eax
        jne .L7
We see that running is re-loaded from memory each time through the loop; it is not even cached in a register. Apparently gcc now thinks that the value of running could have changed.
So why does gcc suddenly decide that it needs to re-check the value of running in this case?
In the general case it’s difficult for a compiler to know exactly which objects a function might have access to and therefore could potentially modify. At the point where putchar() is called, GCC doesn’t know whether there might be a putchar() implementation that is able to modify running, so it has to be somewhat pessimistic and assume that running might in fact have been changed.

For example, there might be a putchar() implementation later in the translation unit that modifies running.

Even if there’s no putchar() implementation in the translation unit, there could be something that, for example, passes the address of the running object out of the translation unit, such that putchar() might be able to modify it.

Note that your handler() function is globally accessible, so putchar() might call handler() itself (directly or otherwise), which is an instance of the above situation.

On the other hand, since running is visible only to the translation unit (being static), by the time the compiler gets to the end of the file it should be able to determine that there is no opportunity for putchar() to access it (assuming that’s the case), and the compiler could go back and ‘fix up’ the pessimization in the while loop.

Since running is static, the compiler might be able to determine that it’s not accessible from outside the translation unit and make the optimization you’re talking about. However, since it’s accessible through handler(), and handler() is accessible externally, the compiler can’t optimize the access away. Even if you make handler() static, it’s still accessible externally, since you pass its address to another function.

Note that in your first example, even though what I mentioned in the above paragraph is still true, the compiler can optimize away the access to running because the ‘abstract machine model’ the C language is based on doesn’t take asynchronous activity into account except in very limited circumstances (one of which is the volatile keyword and another is signal handling, though the requirements on signal handling aren’t strong enough to prevent the compiler from optimizing away the access to running in your first example).

In fact, here’s something C99 says about the abstract machine’s behavior in pretty much these exact circumstances:
Finally, you should note that the C99 standard also says:
So strictly speaking the running variable may need to be declared as: