The Archive Base

The Archive Base Latest Questions

Editorial Team
Asked: May 25, 2026


I have two logically equivalent functions:

long ipow1(int base, int exp) {
    // HISTORICAL NOTE:
    // This wasn't here in the original question, I edited it in,
    if (exp == 0) return 1;

    long result = 1;

    while (exp > 1) {
        if (exp & 1) result *= base;
        exp >>= 1;
        base *= base;
    }

    return result * base;
}

long ipow2(int base, int exp) { 
    long result = 1;

    while (exp) {
        if (exp & 1) result *= base;
        exp >>= 1;
        base *= base;
    }

    return result;
}

NOTICE:

These loops are equivalent because in the former case we are returning result * base (handling the case when exp is or has been reduced to 1) but in the second case we are returning result.
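The equivalence claim can be checked mechanically over the benchmark's input range. The sketch below is an adaptation, not the original code: it assumes unsigned 64-bit variants (the names `upow1`, `upow2` and `count_mismatches` are made up for illustration), because `base *= base` on an `int` overflows for the larger inputs in this range (54^8 already exceeds `INT_MAX`), and signed overflow is undefined behavior in C. Unsigned arithmetic wraps instead, so the comparison stays well defined.

```c
#include <stdint.h>

/* Unsigned adaptation of ipow1, including the exp == 0 guard. */
uint64_t upow1(uint64_t base, unsigned exp) {
    if (exp == 0) return 1;
    uint64_t result = 1;
    while (exp > 1) {
        if (exp & 1) result *= base;
        exp >>= 1;
        base *= base;
    }
    return result * base;
}

/* Unsigned adaptation of ipow2. */
uint64_t upow2(uint64_t base, unsigned exp) {
    uint64_t result = 1;
    while (exp) {
        if (exp & 1) result *= base;
        exp >>= 1;
        base *= base;
    }
    return result;
}

/* Counts the (base, exp) pairs in the benchmark range
   (base 0..54, exp 0..10) on which the two variants disagree. */
int count_mismatches(void) {
    int mismatches = 0;
    for (unsigned i = 0; i < 55; i++)
        for (unsigned j = 0; j < 11; j++)
            if (upow1(i, j) != upow2(i, j)) mismatches++;
    return mismatches;
}
```

Calling `count_mismatches()` from a `main` and checking for zero is enough to confirm the two loops agree on every input the benchmark uses.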


Strangely enough, with both -O3 and -O0, ipow1 consistently outperforms ipow2 by about 25%. How is this possible?

I’m on Windows 7, x64, gcc 4.5.2 and compiling with gcc ipow.c -O0 -std=c99.

And this is my profiling code:

int main(int argc, char *argv[]) {
    LARGE_INTEGER ticksPerSecond;
    LARGE_INTEGER tick;
    LARGE_INTEGER start_ticks, end_ticks, cputime;

    double totaltime = 0;
    int repetitions = 10000;
    int rep = 0;
    int nopti = 0;

    for (rep = 0; rep < repetitions; rep++) {
        if (!QueryPerformanceFrequency(&ticksPerSecond)) printf("\tno go QueryPerformance not present");
        if (!QueryPerformanceCounter(&tick)) printf("no go counter not installed");  
        QueryPerformanceCounter(&start_ticks); 

        /* start real code */

        for (int i = 0; i < 55; i++) {
            for (int j = 0; j < 11; j++) {
                nopti = ipow1(i, j); // or ipow2
            }
        }

        /* end code */

        QueryPerformanceCounter(&end_ticks); 
        cputime.QuadPart = end_ticks.QuadPart - start_ticks.QuadPart;
        totaltime += (double)cputime.QuadPart / (double)ticksPerSecond.QuadPart;
    }   

    printf("\tTotal elapsed CPU time:   %.9f  sec  with %d repetitions - %d:\n", totaltime, repetitions, nopti);

    return 0;
}
1 Answer

  1. Editorial Team
    Added an answer on May 25, 2026 at 4:20 pm

    If you don't want to read all of this, skip to the bottom: I come up with a 21% difference just by analyzing the code.

    Different systems, different compiler versions, and even the same compiler version built by different people or distros will give different instruction mixes; this is just one example of what you might get.

    long ipow1(int base, int exp) {
        long result = 1;
    
        while (exp > 1) {
            if (exp & 1) result *= base;
            exp >>= 1;
            base *= base;
        }
    
        return result * base;
    }
    
    long ipow2(int base, int exp) {
        long result = 1;
    
        while (exp) {
            if (exp & 1) result *= base;
            exp >>= 1;
            base *= base;
        }
    
        return result;
    }
    
    0000000000000000 <ipow1>:
       0:   83 fe 01                cmp    $0x1,%esi
       3:   ba 01 00 00 00          mov    $0x1,%edx
       8:   7e 1d                   jle    27 <ipow1+0x27>
       a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
      10:   40 f6 c6 01             test   $0x1,%sil
      14:   74 07                   je     1d <ipow1+0x1d>
      16:   48 63 c7                movslq %edi,%rax
      19:   48 0f af d0             imul   %rax,%rdx
      1d:   d1 fe                   sar    %esi
      1f:   0f af ff                imul   %edi,%edi
      22:   83 fe 01                cmp    $0x1,%esi
      25:   7f e9                   jg     10 <ipow1+0x10>
      27:   48 63 c7                movslq %edi,%rax
      2a:   48 0f af c2             imul   %rdx,%rax
      2e:   c3                      retq   
      2f:   90                      nop
    
    0000000000000030 <ipow2>:
      30:   85 f6                   test   %esi,%esi
      32:   b8 01 00 00 00          mov    $0x1,%eax
      37:   75 0a                   jne    43 <ipow2+0x13>
      39:   eb 19                   jmp    54 <ipow2+0x24>
      3b:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
      40:   0f af ff                imul   %edi,%edi
      43:   40 f6 c6 01             test   $0x1,%sil
      47:   74 07                   je     50 <ipow2+0x20>
      49:   48 63 d7                movslq %edi,%rdx
      4c:   48 0f af c2             imul   %rdx,%rax
      50:   d1 fe                   sar    %esi
      52:   75 ec                   jne    40 <ipow2+0x10>
      54:   f3 c3                   repz retq 
    

    Isolating the loops:

        while (exp > 1) {
            if (exp & 1) result *= base;
            exp >>= 1;
            base *= base;
        }
    
    
    //if exp & 1 not true jump to 1d to skip   
      10:   40 f6 c6 01             test   $0x1,%sil
      14:   74 07                   je     1d <ipow1+0x1d>
    //result *= base  
      16:   48 63 c7                movslq %edi,%rax
      19:   48 0f af d0             imul   %rax,%rdx
    //exp>>=1  
      1d:   d1 fe                   sar    %esi
    //base *= base  
      1f:   0f af ff                imul   %edi,%edi
    //while(exp>1) stay in the loop  
      22:   83 fe 01                cmp    $0x1,%esi
      25:   7f e9                   jg     10 <ipow1+0x10>
    

    Comparing against zero normally saves you an instruction, and you can see that here: the sar already sets the zero flag, so the jne needs no separate cmp.

        while (exp) {
            if (exp & 1) result *= base;
            exp >>= 1;
            base *= base;
        }
    
    
    //base *= base  
      40:   0f af ff                imul   %edi,%edi
    //if exp & 1 not true jump to skip  
      43:   40 f6 c6 01             test   $0x1,%sil
      47:   74 07                   je     50 <ipow2+0x20>
    //result *= base  
      49:   48 63 d7                movslq %edi,%rdx
      4c:   48 0f af c2             imul   %rdx,%rax
    //exp>>=1  
      50:   d1 fe                   sar    %esi
    //no need for a compare  
      52:   75 ec                   jne    40 <ipow2+0x10>
    

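    Your timing method is going to introduce a lot of error. You start and stop the timer around every repetition, so each short measurement carries the timer's overhead and granularity; depending on how the loop's duration beats against the timer's resolution, one function can pick up a lot of gain and the other a lot of loss. Timing the whole run once normally gives better accuracy: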

    starttime = …
    for(rep=bignumber;rep;rep--)
    {
    //code under test
    …
    }
    endtime = …
    total = endtime - starttime;

    Of course, if you are running this on an operating system, the timing is going to have a decent amount of error in it anyway.

    Also, make your timer variables volatile; that helps keep the compiler from rearranging the timer reads relative to each other (been there, seen that).
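As a rough illustration of that pattern, here is a minimal sketch in C. It is an assumption-laden example, not the question's harness: it uses the portable `clock()` from `<time.h>` instead of `QueryPerformanceCounter`, and the names `time_pow`, `pow_fn`, `sink` and `upow2` are made up for illustration. The function under test is an unsigned variant of ipow2, so the arithmetic wraps rather than overflowing a signed `int`.

```c
#include <time.h>

/* Hypothetical signature for the function under test. */
typedef unsigned long (*pow_fn)(unsigned base, unsigned exp);

/* Results are stored through a volatile object so the compiler
   cannot discard the calls as dead code. */
volatile unsigned long sink;

/* Runs `reps` passes over the 55 x 11 input grid with a single
   start/stop pair around the whole run, and returns the elapsed
   time in seconds as reported by clock(). */
double time_pow(pow_fn fn, long reps) {
    clock_t start = clock();
    for (long rep = reps; rep; rep--)
        for (unsigned i = 0; i < 55; i++)
            for (unsigned j = 0; j < 11; j++)
                sink = fn(i, j);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Example function under test: unsigned variant of ipow2. */
unsigned long upow2(unsigned base, unsigned exp) {
    unsigned long result = 1;
    while (exp) {
        if (exp & 1) result *= base;
        exp >>= 1;
        base *= base;
    }
    return result;
}
```

A call such as `time_pow(upow2, 100000)` times one function; dividing the result by the repetition count gives a per-pass figure. What `clock()` measures and how fine its resolution is vary by platform, so treat the numbers as relative.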

    If we look at this from the perspective of the base multiplies:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    unsigned int mults;
    
    long ipow1(int base, int exp) {
        long result = 1;
    
        while (exp > 1) {
            if (exp & 1) result *= base;
            exp >>= 1;
            base *= base;
            mults++;
        }
    
        result *= base;
    
        return result;
    }
    
    long ipow2(int base, int exp) {
        long result = 1;
    
        while (exp) {
            if (exp & 1) result *= base;
            exp >>= 1;
            base *= base;
            mults++;
        }
    
        return result;
    }
    
    
    int main ( void )
    {
        int i;
        int j;
    
        mults = 0;
            for (i = 0; i < 55; i++) {
                for (j = 0; j < 11; j++) {
                    ipow1(i, j); // or ipow2
                }
            }
        printf("mults %u\n",mults);
    
        mults=0;
    
            for (i = 0; i < 55; i++) {
                for (j = 0; j < 11; j++) {
                    ipow2(i, j);
                }
            }
        printf("mults %u\n",mults);
    
    }
    

    The counts are:

    mults 1045
    mults 1595
    

    That is about 50% more for ipow2() (1595 / 1045 ≈ 1.53). Actually, it is not just the multiplies; you are going through the loop about 50% more times.
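The two totals can also be reproduced without running the power functions at all, since the trip count depends only on the exponent: `while (exp)` runs once per significant bit of `exp`, and `while (exp > 1)` runs one time fewer (and not at all for 0 or 1). A small sketch of that reasoning, with helper names (`bit_length`, `iters_ipow1`, `iters_ipow2`, `total_iters`) invented for illustration:

```c
/* Number of significant bits in exp (0 for exp == 0). */
unsigned bit_length(unsigned exp) {
    unsigned n = 0;
    while (exp) {
        n++;
        exp >>= 1;
    }
    return n;
}

/* Trips through `while (exp > 1)`: the loop stops once only
   the top bit is left, so it runs bit_length - 1 times. */
unsigned iters_ipow1(unsigned exp) {
    unsigned n = bit_length(exp);
    return n ? n - 1 : 0;
}

/* Trips through `while (exp)`: one per significant bit. */
unsigned iters_ipow2(unsigned exp) {
    return bit_length(exp);
}

/* Total over the benchmark grid: 55 bases, exponents 0..10. */
unsigned total_iters(unsigned (*iters)(unsigned)) {
    unsigned per_base = 0;
    for (unsigned j = 0; j < 11; j++)
        per_base += iters(j);
    return 55 * per_base;
}
```

Per base value this gives 19 trips for ipow1 and 29 for ipow2, which multiplied by 55 matches the 1045 and 1595 printed above.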

    ipow1() gets a little back on the other multiplies:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    unsigned int mults;
    
    long ipow1(int base, int exp) {
        long result = 1;
    
        while (exp > 1) {
            if (exp & 1) mults++;
            exp >>= 1;
            base *= base;
        }
        mults++;
    
        return result;
    }
    
    long ipow2(int base, int exp) {
        long result = 1;
    
        while (exp) {
            if (exp & 1) mults++;
            exp >>= 1;
            base *= base;
        }
    
        return result;
    }
    
    
    int main ( void )
    {
        int i;
        int j;
    
        mults = 0;
            for (i = 0; i < 55; i++) {
                for (j = 0; j < 11; j++) {
                    ipow1(i, j); // or ipow2
                }
            }
        printf("mults %u\n",mults);
    
        mults=0;
            for (i = 0; i < 55; i++) {
                for (j = 0; j < 11; j++) {
                    ipow2(i, j);
                }
            }
        printf("mults %u\n",mults);
    
    }
    

    ipow1() performs result *= base more times than ipow2():

    mults 990
    mults 935
    

    Being a long * int can make these multiplies more expensive, but not by enough to make up for the extra trips around the loop in ipow2().
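These counts follow from the set bits of the exponent. ipow2() multiplies into result once per set bit. The ipow1() shown in this answer (without an `exp == 0` early return) does the same for every bit below the top one, then multiplies once more unconditionally at the end, so it matches ipow2() for every nonzero exponent and does one extra multiply only when exp is 0. A sketch of that accounting, with helper names invented for illustration:

```c
/* Number of set bits in x. */
unsigned count_set_bits(unsigned x) {
    unsigned n = 0;
    while (x) {
        n += x & 1;
        x >>= 1;
    }
    return n;
}

/* result *= base count for this answer's ipow1 (no exp == 0 guard):
   set bits below the top bit, plus the final unconditional multiply. */
unsigned result_mults_ipow1(unsigned exp) {
    return exp ? count_set_bits(exp) : 1;
}

/* result *= base count for ipow2: one per set bit. */
unsigned result_mults_ipow2(unsigned exp) {
    return count_set_bits(exp);
}

/* Total over the benchmark grid: 55 bases, exponents 0..10. */
unsigned total_result_mults(unsigned (*mults)(unsigned)) {
    unsigned per_base = 0;
    for (unsigned j = 0; j < 11; j++)
        per_base += mults(j);
    return 55 * per_base;
}
```

The totals come out to 990 and 935; the difference of 55 is exactly one extra multiply per base value, all of it from the exp == 0 case.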

    You can get a similar answer even without disassembling, by making a rough guess at the operations/instructions you hope the compiler uses. This accounts for processors in general, not necessarily x86; some processors will run this code better than others (from a number-of-instructions-executed perspective, not counting all the other factors).

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    unsigned int ops;
    
    long ipow1(int base, int exp) {
        long result = 1;
        ops++; //result = immediate
        while (exp > 1) {
            ops++; // compare exp - 1
            ops++; // conditional jump
                //if (exp & 1)
            ops++; //exp&1
            ops++; //conditional jump
            if (exp & 1)
            {
                result *= base;
                ops++;
            }
            exp >>= 1;
            ops++;
            //ops+=?; //using a signed number can cost you this on some systems
            //always use unsigned unless you have a specific reason to use signed.
            //if this had been a short or char variable it might cost you even more
            //operations
            //if this needs to be signed it is what it is, just be aware of
            //the cost
            base *= base;
            ops++;
        }
        result *= base;
        ops++;
        return result;
    }
    
    long ipow2(int base, int exp) {
        long result = 1;
        ops++;
        while (exp) {
            //ops++; //cmp exp-0, often optimizes out;
            ops++; //conditional jump
            //if (exp & 1)
            ops++;
            ops++;
            if (exp & 1)
            {
                result *= base;
                ops++;
            }
            exp >>= 1;
            ops++;
            //ops+=?; //right shifting a signed number
            base *= base;
            ops++;
        }
        return result;
    }
    
    
    
    int main ( void )
    {
        int i;
        int j;
    
        ops = 0;
            for (i = 0; i < 55; i++) {
                for (j = 0; j < 11; j++) {
                    ipow1(i, j); // or ipow2
                }
            }
        printf("ops %u\n",ops);
    
        ops=0;
            for (i = 0; i < 55; i++) {
                for (j = 0; j < 11; j++) {
                    ipow2(i, j);
                }
            }
        printf("ops %u\n",ops);
    
    }
    

    Assuming I counted all the major operations and didn't unfairly give one function more than another:

    ops 7865
    ops 9515
    

    ipow2 is 21% slower using this analysis (9515 / 7865 ≈ 1.21).

    I think the big killer is the 50% more trips through the loop. Granted, it is data dependent: you might find benchmark inputs that make the difference between the functions larger or smaller than the 25% you are seeing.
