The Archive Base

Editorial Team
Asked: June 4, 2026 (2026-06-04T08:02:25+00:00)

I have two files:

#include <stdio.h>

static inline void print0() { printf("Zero"); }
static inline void print1() { printf("One"); }
static inline void print2() { printf("Two"); }
static inline void print3() { printf("Three"); }
static inline void print4() { printf("Four"); }

int main()
{
    unsigned int input;
    scanf("%u", &input);

    switch (input)
    {
        case 0: print0(); break;
        case 1: print1(); break;
        case 2: print2(); break;
        case 3: print3(); break;
        case 4: print4(); break;
    }
    return 0;
}

and

#include <stdio.h>

static inline void print0() { printf("Zero"); }
static inline void print1() { printf("One"); }
static inline void print2() { printf("Two"); }
static inline void print3() { printf("Three"); }
static inline void print4() { printf("Four"); }

int main()
{
    unsigned int input;
    scanf("%u", &input);

    static void (*jt[])() = { print0, print1, print2, print3, print4 };
    jt[input]();
    return 0;
}

I expected them to be compiled to almost identical assembly code. In both cases jump tables are generated, but the calls in the first file are represented by jmp, while the calls in the second one are represented by call. Why doesn’t the compiler optimise the calls? Is it possible to hint gcc that I would like to see jmps instead of calls?

Compiled with gcc -Wall -Winline -O3 -S -masm=intel, GCC version 4.6.2. GCC 4.8.0 produces slightly less code, but the problem still persists.

UPD: Defining jt as const void (* const jt[])() = { print0, print1, print2, print3, print4 }; and making the functions static const inline didn’t help: http://ideone.com/97SU0

1 Answer

    Editorial Team
    Added an answer on June 4, 2026 at 8:02 am

    The first case (through the switch()) creates the following for me (Linux x86_64 / gcc 4.4):

      400570:       ff 24 c5 b8 06 40 00    jmpq   *0x4006b8(,%rax,8)
    [ ... ]
      400580:       31 c0                   xor    %eax,%eax
      400582:       e8 e1 fe ff ff          callq  400468 <printf@plt>
      400587:       31 c0                   xor    %eax,%eax
      400589:       48 83 c4 08             add    $0x8,%rsp
      40058d:       c3                      retq
      40058e:       bf a4 06 40 00          mov    $0x4006a4,%edi
      400593:       eb eb                   jmp    400580 <main+0x30>
      400595:       bf a9 06 40 00          mov    $0x4006a9,%edi
      40059a:       eb e4                   jmp    400580 <main+0x30>
      40059c:       bf ad 06 40 00          mov    $0x4006ad,%edi
      4005a1:       eb dd                   jmp    400580 <main+0x30>
      4005a3:       bf b1 06 40 00          mov    $0x4006b1,%edi
      4005a8:       eb d6                   jmp    400580 <main+0x30>
    [ ... ]
    Contents of section .rodata:
    [ ... ]
     4006b8 8e054000 [ ... ]
    

    Note that objdump prints the .rodata contents at 4006b8 byte by byte in memory order, and x86 is little-endian, so the bytes 8e 05 40 00 encode the address 0x40058e. That address lies within main above, where the block of argument-initializer/jmp pairs starts. Each table entry is an 8-byte address, hence the (,%rax,8) scaling in the indirect jmp. In this case, the sequence is therefore:

    jmp <to location that sets arg for printf()>
    ...
    jmp <back to common location for the printf() invocation>
    ...
    call <printf>
    ...
    retq
    

    This means the compiler has actually optimized away the individual call sites and merged them into a single printf() call shared by all the inlined functions. The jump table is used by the jmp *...(,%rax,8) instruction, and the compiler generated the table itself (it ends up in .rodata).

    The second one (with the explicitly-created table) does the following for me:

    0000000000400550 <print0>:
    [ ... ]
    0000000000400560 <print1>:
    [ ... ]
    0000000000400570 <print2>:
    [ ... ]
    0000000000400580 <print3>:
    [ ... ]
    0000000000400590 <print4>:
    [ ... ]
    00000000004005a0 <main>:
      4005a0:       48 83 ec 08             sub    $0x8,%rsp
      4005a4:       bf d4 06 40 00          mov    $0x4006d4,%edi
      4005a9:       31 c0                   xor    %eax,%eax
      4005ab:       48 8d 74 24 04          lea    0x4(%rsp),%rsi
      4005b0:       e8 c3 fe ff ff          callq  400478 <scanf@plt>
      4005b5:       8b 54 24 04             mov    0x4(%rsp),%edx
      4005b9:       31 c0                   xor    %eax,%eax
      4005bb:       ff 14 d5 60 0a 50 00    callq  *0x500a60(,%rdx,8)
      4005c2:       31 c0                   xor    %eax,%eax
      4005c4:       48 83 c4 08             add    $0x8,%rsp
      4005c8:       c3                      retq
    [ ... ]
     500a60 50054000 00000000 60054000 00000000  P.@.....`.@.....
     500a70 70054000 00000000 80054000 00000000  p.@.......@.....
     500a80 90054000 00000000                    ..@.....
    

    Again, note the little-endian byte order in which objdump prints the data section: reverse the bytes of each entry and you get the function addresses of print[0-4]().

    The compiler invokes the target through an indirect call, i.e. the table is used directly by the call instruction, and the table has explicitly been created as data.

    Edit:
    If you change the source like this:

    #include <stdio.h>
    
    static inline void print0() { printf("Zero"); }
    static inline void print1() { printf("One"); }
    static inline void print2() { printf("Two"); }
    static inline void print3() { printf("Three"); }
    static inline void print4() { printf("Four"); }
    
    void main(int argc, char **argv)
    {
        static void (*jt[])() = { print0, print1, print2, print3, print4 };
        return jt[argc]();
    }

    the created assembly for main() becomes:

    0000000000400550 <main>:
      400550:       48 63 ff                movslq %edi,%rdi
      400553:       31 c0                   xor    %eax,%eax
      400555:       4c 8b 1c fd e0 09 50    mov    0x5009e0(,%rdi,8),%r11
      40055c:       00
      40055d:       41 ff e3                jmpq   *%r11
    

    which looks more like what you wanted?

    The reason is that you need “stackless” functions for this to work. A tail call (leaving a function via jmp to the callee instead of call followed by ret) is only possible if all stack cleanup has already been done, or if there is nothing on the stack to clean up. The compiler can (but need not) clean up before the last function call, in which case that call can be made with jmp, but only if you return exactly the value you got from that function, or if you “return void”. Your original main() does neither: it has to return 0 after the call. And, as said, if you actually use the stack (as your example does for the input variable, whose address is passed to scanf()), there is nothing you can do to force the compiler to undo this in such a way that a tail call results.

    Edit2:

    The disassembly for the first example, with the same changes (argc instead of input and forcing void main; no standard-conformance comments please, this is a demo), looks like this:

    0000000000400500 <main>:
      400500:       83 ff 04                cmp    $0x4,%edi
      400503:       77 0b                   ja     400510 <main+0x10>
      400505:       89 f8                   mov    %edi,%eax
      400507:       ff 24 c5 58 06 40 00    jmpq   *0x400658(,%rax,8)
      40050e:       66                      data16
      40050f:       90                      nop
      400510:       f3 c3                   repz retq
      400512:       bf 3c 06 40 00          mov    $0x40063c,%edi
      400517:       31 c0                   xor    %eax,%eax
      400519:       e9 0a ff ff ff          jmpq   400428 <printf@plt>
      40051e:       bf 41 06 40 00          mov    $0x400641,%edi
      400523:       31 c0                   xor    %eax,%eax
      400525:       e9 fe fe ff ff          jmpq   400428 <printf@plt>
      40052a:       bf 46 06 40 00          mov    $0x400646,%edi
      40052f:       31 c0                   xor    %eax,%eax
      400531:       e9 f2 fe ff ff          jmpq   400428 <printf@plt>
      400536:       bf 4a 06 40 00          mov    $0x40064a,%edi
      40053b:       31 c0                   xor    %eax,%eax
      40053d:       e9 e6 fe ff ff          jmpq   400428 <printf@plt>
      400542:       bf 4e 06 40 00          mov    $0x40064e,%edi
      400547:       31 c0                   xor    %eax,%eax
      400549:       e9 da fe ff ff          jmpq   400428 <printf@plt>
      40054e:       90                      nop
      40054f:       90                      nop
    

    This is worse in one way (it takes two jmps instead of one) but better in another (it eliminates the static functions and inlines their code). Optimization-wise, the compiler has done pretty much the same thing.
