I’m doing some experimenting with x86-64 assembly. Having compiled this dummy function:
    long myfunc(long a, long b, long c, long d,
                long e, long f, long g, long h)
    {
        long xx = a * b * c * d * e * f * g * h;
        long yy = a + b + c + d + e + f + g + h;
        long zz = utilfunc(xx, yy, xx % yy);
        return zz + 20;
    }
with gcc -O0 -g, I was surprised to find the following at the beginning of the function’s assembly:
0000000000400520 <myfunc>:
400520: 55 push rbp
400521: 48 89 e5 mov rbp,rsp
400524: 48 83 ec 50 sub rsp,0x50
400528: 48 89 7d d8 mov QWORD PTR [rbp-0x28],rdi
40052c: 48 89 75 d0 mov QWORD PTR [rbp-0x30],rsi
400530: 48 89 55 c8 mov QWORD PTR [rbp-0x38],rdx
400534: 48 89 4d c0 mov QWORD PTR [rbp-0x40],rcx
400538: 4c 89 45 b8 mov QWORD PTR [rbp-0x48],r8
40053c: 4c 89 4d b0 mov QWORD PTR [rbp-0x50],r9
400540: 48 8b 45 d8 mov rax,QWORD PTR [rbp-0x28]
400544: 48 0f af 45 d0 imul rax,QWORD PTR [rbp-0x30]
400549: 48 0f af 45 c8 imul rax,QWORD PTR [rbp-0x38]
40054e: 48 0f af 45 c0 imul rax,QWORD PTR [rbp-0x40]
400553: 48 0f af 45 b8 imul rax,QWORD PTR [rbp-0x48]
400558: 48 0f af 45 b0 imul rax,QWORD PTR [rbp-0x50]
40055d: 48 0f af 45 10 imul rax,QWORD PTR [rbp+0x10]
400562: 48 0f af 45 18 imul rax,QWORD PTR [rbp+0x18]
gcc very strangely spills all argument registers onto the stack and then takes them from memory for further operations.
This only happens at -O0 (with -O1 there are no problems), but still, why? This looks like an anti-optimization to me – why would gcc do that?
I am by no means a GCC internals expert, but I’ll give it a shot. Unfortunately, most of the information on GCC’s register allocation and spilling seems to be out of date (referencing files like local-alloc.c that don’t exist anymore).

I’m looking at the source code of gcc-4.5-20110825. In GNU C Compiler Internals it is mentioned that the initial function code is generated by expand_function_start in gcc/function.c. There we find the following for handling parameters:

In assign_parms, the code that decides where each argument is stored is the following:

assign_parm_setup_block_p handles aggregate data types and is not applicable in this case, and since the data is not passed as a pointer, GCC checks use_register_for_decl.

Here the relevant part is:
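As a rough illustration only, the decision traced here can be modeled with a few lines of C. This is a toy sketch, not GCC's source: the `parm_info` struct, the `parm_home` enum and the `model_*` functions are made-up names, the `optimize` parameter stands in for the optimization level, and sending pointer-passed parameters straight to a register is a simplifying assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; these are NOT GCC's data structures. */
enum parm_home { HOME_BLOCK, HOME_REGISTER, HOME_STACK };

struct parm_info {
    bool is_aggregate;      /* would take the assign_parm_setup_block_p path */
    bool passed_pointer;    /* argument is passed as a pointer */
    bool declared_register; /* models what DECL_REGISTER tests */
};

/* Models the use_register_for_decl check as described in the text:
   with optimization on, prefer a register; otherwise only honor an
   explicit `register` declaration. */
bool model_use_register(const struct parm_info *p, int optimize)
{
    if (optimize)
        return true;
    return p->declared_register;
}

/* Models the three-way choice made for each parameter. */
enum parm_home model_assign_parm(const struct parm_info *p, int optimize)
{
    if (p->is_aggregate)
        return HOME_BLOCK;
    /* Assumption: pointer-passed parameters skip the register check. */
    if (p->passed_pointer || model_use_register(p, optimize))
        return HOME_REGISTER;
    return HOME_STACK; /* the path that produces the spills seen at -O0 */
}
```

In this model, a plain `long` parameter with `optimize` set to 0 ends up in `HOME_STACK`, which corresponds to the stores to `[rbp-0x28]` and friends in the question's listing.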
DECL_REGISTER tests whether the variable was declared with the register keyword. And now we have our answer: most parameters live on the stack when optimizations are not enabled, and are then handled by assign_parm_setup_stack. The route taken through the source code before it ends up spilling the value is slightly more complicated for pointer arguments, but can be traced in the same file if you’re curious.

Why does GCC spill all arguments and local variables with optimizations disabled? To help debugging. Consider this simple function:
Compiled with gcc -O1 -c, this generates the following on my machine:

Which is fine, except that if you break on line 5 and try to print the value of a, you get

This happens because the argument gets overwritten, since it’s not used after the call to bar.
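If you want to try this yourself, a function of roughly the following shape should be enough. It is an illustrative sketch rather than an exact listing: the bodies of `foo` and `bar` are arbitrary, and the GCC-specific `noinline` attribute is my own addition so that the call to `bar` survives optimization when both functions sit in one file.

```c
/* Hypothetical helper; noinline (a GCC extension) keeps the call in place. */
__attribute__((noinline)) int bar(int x)
{
    return x + 1;
}

int foo(int a)
{
    int b = bar(a | 1); /* last use of a */
    b += 42;            /* a is dead from here on, so with optimization
                           its register may be reused */
    return b;
}
```

Build it once with `gcc -O0 -g` and once with `gcc -O1 -g`, break on the line after the call to `bar`, and compare what the debugger prints for `a`. At -O0 the value was stored in the function's stack frame on entry, so it is still there to be read.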