Is there any substantial optimization when omitting the frame pointer?
If I have understood correctly by reading this page, -fomit-frame-pointer is used when we want to avoid saving, setting up and restoring frame pointers.
Is this done only for each function call and if so, is it really worth to avoid a few instructions for every function?
Isn’t it trivial for an optimization.
What are the actual implications of using this option apart from the debugging limitations?
I compiled the following C code with and without this option
int main(void)
{
int i;
i = myf(1, 2);
}
int myf(int a, int b)
{
return a + b;
}
,
# gcc -S -fomit-frame-pointer code.c -o withoutfp.s
# gcc -S code.c -o withfp.s
.
diff -u ‘ing the two files revealed the following assembly code:
--- withfp.s 2009-12-22 00:03:59.000000000 +0000
+++ withoutfp.s 2009-12-22 00:04:17.000000000 +0000
@@ -7,17 +7,14 @@
leal 4(%esp), %ecx
andl $-16, %esp
pushl -4(%ecx)
- pushl %ebp
- movl %esp, %ebp
pushl %ecx
- subl $36, %esp
+ subl $24, %esp
movl $2, 4(%esp)
movl $1, (%esp)
call myf
- movl %eax, -8(%ebp)
- addl $36, %esp
+ movl %eax, 20(%esp)
+ addl $24, %esp
popl %ecx
- popl %ebp
leal -4(%ecx), %esp
ret
.size main, .-main
@@ -25,11 +22,8 @@
.globl myf
.type myf, @function
myf:
- pushl %ebp
- movl %esp, %ebp
- movl 12(%ebp), %eax
- addl 8(%ebp), %eax
- popl %ebp
+ movl 8(%esp), %eax
+ addl 4(%esp), %eax
ret
.size myf, .-myf
.ident "GCC: (GNU) 4.2.1 20070719
Could someone please shed light on the key points of the above code where -fomit-frame-pointer did actually make the difference?
Edit: objdump‘s output replaced with gcc -S‘s
-fomit-frame-pointerallows one extra register to be available for general-purpose use. I would assume this is really only a big deal on 32-bit x86, which is a bit starved for registers.*One would expect to see EBP no longer saved and adjusted on every function call, and probably some additional use of EBP in normal code, and fewer stack operations on occasions where EBP gets used as a general-purpose register.
Your code is far too simple to see any benefit from this sort of optimization– you’re not using enough registers. Also, you haven’t turned on the optimizer, which might be necessary to see some of these effects.
* ISA registers, not micro-architecture registers.