The Archive Base

Editorial Team
Asked: June 1, 2026 (2026-06-01T16:22:20+00:00)

I am learning x86 assembler in order to write a compiler. In particular, I’m taking a variety of simple recursive functions and feeding them through different compilers (OCaml, GCC etc.) in order to get a better understanding of the kinds of assembler generated by different compilers.

I’ve got a trivial recursive integer Fibonacci function:

int fib(int x) { return (x < 2 ? x : fib(x-1)+fib(x-2)); }

My hand-coded assembly looks like this:

fib:
    cmp eax, 2
    jl  fin
    push    eax
    dec eax
    call    fib
    push    eax
    mov eax, [esp+4]
    add eax, -2
    call    fib
    add eax, [esp]
    add esp, 8
fin:
    ret

Compiling the C function to Intel-syntax assembler using gcc -O2 produces this enigmatic code:

_fib:
    push    edi
    push    esi
    push    ebx
    sub esp, 16
    mov edi, DWORD PTR [esp+32]
    cmp edi, 1
    jle L4
    mov ebx, edi
    xor esi, esi
L3:
    lea eax, [ebx-1]
    mov DWORD PTR [esp], eax
    call    _fib
    sub ebx, 2
    add esi, eax
    cmp ebx, 1
    jg  L3
    and edi, 1
L2:
    lea eax, [esi+edi]
    add esp, 16
    pop ebx
    pop esi
    pop edi
    ret
L4:
    xor esi, esi
    jmp L2

So I guess the calling convention is: argument at [esp+4] on entry, return value in eax. The function starts by pushing edi, esi and ebx, then claims another 16 bytes for a stack frame, enough for 4 temporary ints. edi is then loaded from [esp+32], which is the argument. If the argument is <= 1, it jumps to L4, which zeroes esi before jumping back to L2, which sets eax = esi + edi, i.e. just the argument held in edi. If the argument is > 1, it is copied into ebx and esi is zeroed before falling through into L3. In L3, it sets eax = ebx - 1 and stores the result (n-1) at [esp] in the stack frame before recursing to calculate fib(n-1). ebx is then decremented by 2, the result of the call is added to esi, and it loops back to L3 if ebx > 1; otherwise it extracts the lowest bit of edi before falling through to L2.
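The offset arithmetic in the paragraph above can be checked with a small C sketch. The names below (FRAME_BYTES, incoming_arg_offset and so on) are made up for illustration; only the sizes come from the listing. The sketch assumes 32-bit x86 with 4-byte stack slots.

```c
/* Bytes that sit between esp and the incoming argument once
   _fib's prologue has run (32-bit x86, 4-byte stack slots).
   All names here are hypothetical; the sizes are read off the listing. */
enum {
    FRAME_BYTES       = 16,    /* sub esp, 16 */
    SAVED_REG_BYTES   = 3 * 4, /* push edi / push esi / push ebx */
    RETURN_ADDR_BYTES = 4      /* pushed by the caller's call instruction */
};

/* Offset of the argument relative to esp at function entry,
   before anything has been pushed. */
int arg_offset_on_entry(void)
{
    return RETURN_ADDR_BYTES;
}

/* Offset of the same argument relative to esp after the prologue,
   which is where "mov edi, DWORD PTR [esp+32]" reads it. */
int incoming_arg_offset(void)
{
    return FRAME_BYTES + SAVED_REG_BYTES + RETURN_ADDR_BYTES;
}
```

Only one slot of the 16-byte frame is actually used: [esp], which holds the outgoing argument for the recursive call. The remaining 12 bytes are probably padding that keeps esp 16-byte aligned at the call, which is GCC's usual default on x86, though that is an assumption about the build rather than something the listing shows.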

Why is this code so convoluted? Is there a name for an optimization that has been applied here that I’m not seeing?

The recursive calls fib(n-2) seem to have been replaced with a loop accumulating in esi but that call wasn’t in tail position so how was this done?

What is the purpose of the and edi, 1?

What is the purpose of the mov DWORD PTR [esp], eax?

Why is the stack frame so large?

Can you disassemble this algorithm back into C to make it clearer what is going on?

My preliminary impression is that GCC generates pretty poor x86 assembler. In this case, over 2× more code for equal performance (3.25s for fib(40) on this 1.6GHz Atom for both programs). Is that fair?

1 Answer

  1. Editorial Team
    Added an answer on June 1, 2026 at 4:22 pm

    Note that one of the two recursive calls has been turned into a loop with an accumulator, as if the compiler had first made the second call a tail call and then eliminated it. In effect it has replaced:

    return x < 2 ? x : fib(x - 2) + fib(x - 1);
    

    with:

    int xprime, acc; /* xprime is edi, acc is esi; x itself ends up in ebx */

    if ((xprime = x) < 2) {
        acc = 0;
    } else {
        /* at this point we know x >= 2 */
        acc = 0; /* start with 0 */
        while (x > 1) {
           acc += fib(x - 1); /* add fib(x-1) */
           x -= 2; /* what remains is fib(x-2); loop to expand it the same way */
        }
        /* so at this point we know either x==1 or x==0 */
        xprime = x == 1 ? 1 : 0; /* same as the original x & 1, hence "and edi, 1" */
    }
    return xprime + acc;
    

    I suspect this rather tricky loop arose from multiple optimization steps, not that I have fiddled with gcc optimization since about gcc 2.3 (it’s all very different inside now!).
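One plausible route from the original function to that loop can be sketched in three stages. This is an assumption about how such a transformation could go, not a description of GCC's actual passes, and the names fib_naive, fib_acc and fib_loop are hypothetical. The key point is that integer addition is associative, so the pending "+ fib(x-1)" can be carried in an accumulator, which puts the second call in tail position.

```c
/* Stage 0: the function as written in the question. */
int fib_naive(int x)
{
    return x < 2 ? x : fib_naive(x - 1) + fib_naive(x - 2);
}

/* Stage 1: carry the pending sum in an accumulator.
   fib_acc(x, acc) returns acc + fib(x). The call for x - 2 is now
   the last thing the function does, i.e. it is a tail call. */
int fib_acc(int x, int acc)
{
    if (x < 2)
        return acc + x;
    return fib_acc(x - 2, acc + fib_acc(x - 1, 0));
}

/* Stage 2: replace the tail call with a jump back to the top.
   This has the same shape as the gcc -O2 output:
   esi = acc, ebx = x (the loop counter), edi = xprime. */
int fib_loop(int x)
{
    int xprime = x; /* original argument, kept in edi */
    int acc = 0;    /* running sum, kept in esi */

    while (x > 1) {
        acc += fib_loop(x - 1); /* the one remaining real call */
        x -= 2;                 /* continue with fib(x - 2) */
    }
    if (xprime > 1)
        xprime &= 1; /* final x is 0 or 1, equal to the low bit of the argument */
    return xprime + acc;
}
```

All three functions should return the same value for any argument, for example 55 for an argument of 10. The loop version still makes the fib(x-1) calls recursively, which fits the observation that the running time is about the same as the hand-written version.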
