The Archive Base

Editorial Team
Asked: June 12, 2026

I’m a newcomer to C++ programming, and I’m trying to see what I would gain by moving all my MATLAB software to C++. I do finite element work, mainly nonlinear, so one of the operations I need to perform massively is the cross product of two vectors. I’ve tested two implementations in MATLAB and C++, and C++ seems to be much faster. Within C++, however, two different implementations give different timings. I’m using Intel MKL.

Here is the code:

#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <iostream>
#include <mkl.h>


void vprod( double vgr[3], double vg1[3], double vg2[3]);


int main() {

    double v1[3]={1.22, 2.65, 3.65}, v2[3]={6.98, 98.159, 54.65}, vr[3];
    int LC=1000000;
    int i,j,k;
    double tiempo=0.0, tinicial;

    //------------------------------------------------------------------------
    std::cout << "INLINE METHOD: " << std::endl;

    tinicial = dsecnd();
    for (i=0; i<LC; i++){
        vr[0] = v1[1]*v2[2]-v1[2]*v2[1];
        vr[1] =-(v1[0]*v2[2]-v1[2]*v2[0]);
        vr[2] = v1[0]*v2[1]-v1[1]*v2[0];
    }

    tiempo = (dsecnd() - tinicial);
    std::cout << "Tiempo Total: " << tiempo << std::endl;
    std::cout << "Resultado: " << vr[0] << std::endl;
    //------------------------------------------------------------------------

    //------------------------------------------------------------------------
    std::cout << "FUNCTION METHOD: " << std::endl;

    tinicial = dsecnd();
    for (i=0; i<LC; i++){
        vprod (vr,v1,v2);
    }

    tiempo = (dsecnd() - tinicial);
    std::cout << "Tiempo Total: " << tiempo << std::endl;
    std::cout << "Resultado: " << vr[0] << std::endl;
    //------------------------------------------------------------------------

    std::cin.ignore();
    return 0;

}


inline void vprod( double vgr[3], double vg1[3], double vg2[3]){
    vgr[0] = vg1[1]*vg2[2]-vg1[2]*vg2[1]; 
    vgr[1] =-(vg1[0]*vg2[2]-vg1[2]*vg2[0]);
    vgr[2] = vg1[0]*vg2[1]-vg1[1]*vg2[0];

}

My question is: why is the first implementation 3 times faster than the second? Is this the result of function call overhead? Thanks!

EDIT: I’ve modified the code so that the compiler cannot “guess” the results of a loop over constant vectors. As @phonetagger showed, the results are very different. I got 28500 microseconds without the vprod function and 29000 microseconds with it. These numbers were obtained with Ox optimization. Changing the optimization level doesn’t affect the comparison as long as the inline keyword is used, although the numbers rise a bit. If the inline keyword is not used (and optimization is off), the timings are 32000 microseconds without the vprod function and 37000 microseconds with it. So the function call overhead may be around 5000 microseconds over the 1,000,000 calls, or roughly 5 nanoseconds per call.

The new code is:

#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <iostream>
#include <mkl.h>

//#include <mkl_lapack.h>

void vprod( double *vgr, int ploc, double *vg1, double *vg2);


int main() {

    int nv=1000000;
    int dim=3*nv;
    double *v1, *v2, *vr; // Declare Pointers
    int ploc, i;
    double tiempo=0.0, tinicial;

     v1 = new double [dim];  //Allocate block of memory
     v2 = new double [dim];
     vr = new double [dim];

// Fill vectors with something
    for (i = 0; i < dim; i++) {
        v1[i] =1.25 +  (double)(i+1);
        v2[i] =2.62+ 2*(double)(i+7);
    }



    //------------------------------------------------------------------------
    std::cout << "RUTINA CON CODIGO INLINE: \n" ;

    tinicial = dsecnd();
    ploc = 0; // ploc points to an intermediate location.
    for (i=0; i<nv; i++){   
        vr[ploc] = v1[ploc+1]*v2[ploc+2]-v1[ploc+2]*v2[ploc+1]; 
        vr[ploc+1] =-(v1[ploc]*v2[ploc+2]-v1[ploc+2]*v2[ploc]);
        vr[ploc+2] = v1[ploc]*v2[ploc+1]-v1[ploc+1]*v2[ploc];
        ploc +=3;
    };

    tiempo = (dsecnd() - tinicial);
    std::cout << "Tiempo Total: " << tiempo << ".\n";
    std::cout << "Resultado: " << vr[0] << ".\n";

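    delete[] v1;  // arrays allocated with new[] must be released with delete[]
    delete[] v2;  // (the original "delete v1,v2,vr;" only touched v1)
    delete[] vr;

    v1 = new double [dim];  //Allocate block of memory
    v2 = new double [dim];
    vr = new double [dim];

    // Refill the inputs: freshly allocated arrays are uninitialized
    for (i = 0; i < dim; i++) {
        v1[i] =1.25 +  (double)(i+1);
        v2[i] =2.62+ 2*(double)(i+7);
    }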
    //------------------------------------------------------------------------

    //------------------------------------------------------------------------
    std::cout << "RUTINA LLAMANDO A FUNCION: \n" ;

    ploc=0;
    tinicial = dsecnd();
    for (i=0; i<nv; i++){
        vprod ( vr, ploc, v1, v2);
        ploc +=3;
    }

    tiempo = (dsecnd() - tinicial);
    std::cout << "Tiempo Total: " << tiempo << ".\n";
    std::cout << "Resultado: " << vr[0] << ".\n";
    //------------------------------------------------------------------------

    delete[] v1;  // release the arrays used by the second test
    delete[] v2;
    delete[] vr;

    std::cin.ignore();
    return 0;

}


inline void vprod( double *vgr, int ploc, double *vg1, double *vg2) {
        vgr[ploc]    =   vg1[ploc+1]*vg2[ploc+2]-vg1[ploc+2]*vg2[ploc+1]; 
        vgr[ploc+1]  = -(vg1[ploc]*vg2[ploc+2]-vg1[ploc+2]*vg2[ploc]);
        vgr[ploc+2]  =   vg1[ploc]*vg2[ploc+1]-vg1[ploc+1]*vg2[ploc];

}
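
The manual new/delete bookkeeping in the listing above is easy to get wrong. As a minimal sketch (not part of the timed code, and assuming the same packed layout x0,y0,z0,x1,y1,z1,...), the same batch of cross products can be written with std::vector, which releases its memory automatically. The name cross_all is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: treats a and b as packed 3-vectors
// (x0,y0,z0,x1,y1,z1,...) and returns the packed cross products a_k x b_k.
inline std::vector<double> cross_all(const std::vector<double>& a,
                                     const std::vector<double>& b) {
    // Both inputs must hold the same whole number of 3-vectors.
    assert(a.size() == b.size() && a.size() % 3 == 0);
    std::vector<double> r(a.size());
    for (std::size_t p = 0; p < a.size(); p += 3) {
        r[p]   =   a[p+1]*b[p+2] - a[p+2]*b[p+1];
        r[p+1] = -(a[p]*b[p+2]   - a[p+2]*b[p]);
        r[p+2] =   a[p]*b[p+1]   - a[p+1]*b[p];
    }
    return r;
}
```

Called with two vectors of length 3*nv, it returns a vector of the same length. Whether this is as fast as the raw-pointer version would have to be measured; it is shown only to make the memory handling safer.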
1 Answer

  1. Editorial Team
    Added an answer on June 12, 2026 at 11:55 am

    Martin, you are absolutely right (ref. Martin’s comment… 3rd comment under my 17:57 Oct 5 2012 answer). Yes, it appears that at higher optimization levels the compiler recognized that the contents of your arrays were known at compile time, so it could perform the entire computation, loop and all, at compile time and optimize the loop out entirely.
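
To illustrate the effect described above, here is a minimal sketch of one way to keep the inputs opaque to the optimizer: reading them through volatile objects on every iteration. This is not the approach used for the measurements in this answer, and the names cross and opaque_cross are made up for the illustration.

```cpp
#include <cassert>

// Plain cross product r = a x b, same formula as vprod() in the question.
inline void cross(double r[3], const double a[3], const double b[3]) {
    r[0] = a[1]*b[2] - a[2]*b[1];
    r[1] = -(a[0]*b[2] - a[2]*b[0]);
    r[2] = a[0]*b[1] - a[1]*b[0];
}

// Hypothetical helper: runs `reps` cross products on inputs that are re-read
// from volatile storage each iteration, so the compiler may not assume their
// values and precompute the result at compile time.
inline double opaque_cross(int reps) {
    assert(reps > 0);
    volatile double va[3] = {1.22, 2.65, 3.65};
    volatile double vb[3] = {6.98, 98.159, 54.65};
    volatile double sink = 0.0;  // volatile store keeps every result observable
    for (int i = 0; i < reps; ++i) {
        double a[3] = {va[0], va[1], va[2]};
        double b[3] = {vb[0], vb[1], vb[2]};
        double r[3];
        cross(r, a, b);
        sink = r[0];
    }
    return sink;
}
```

The volatile loads and stores cost time themselves, so a loop like this measures the cross product plus that extra memory traffic. It is useful for checking that the work is really being done, less so for absolute numbers.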

    I re-coded the test code into three separate files (one header & two source files) and broke the computation & loop out into a separate function to keep the compiler from being too smart with its optimizations. Now it can’t optimize the loops into a compile-time computation. Below are my new results. Note that I added another loop (0 to N, with N = 100 in the code as posted) around the original 0 to 1000000 loop, and then divided by N. I did this for two reasons: it allows us to compare today’s numbers with the previous numbers, and it also averages out irregularities due to processes swapping in the middle of the test. That may not matter to you, since I think dsecnd() reports only the CPU time of its own process?
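
For readers who would rather stay in a single file, the following is a rough sketch of the same two ideas (keep the call a real call, and average over repeated runs). It assumes a C++11 compiler and GCC or Clang for the noinline attribute; it is not the setup that produced the numbers below. The names NOINLINE, vprod_noinline, loop_func_sketch and average_us are made up for this illustration. Note that noinline only forbids inlining; it does not rule out every other interprocedural optimization, so checking the generated assembly is still worthwhile.

```cpp
#include <cassert>
#include <chrono>

// GCC/Clang-specific attribute (an assumption about the toolchain): forbids
// inlining so that each call stays a real function call.
#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

NOINLINE void vprod_noinline(double r[3], const double a[3], const double b[3]) {
    r[0] = a[1]*b[2] - a[2]*b[1];
    r[1] = -(a[0]*b[2] - a[2]*b[0]);
    r[2] = a[0]*b[1] - a[1]*b[0];
}

// Hypothetical kernel in the spirit of loop_func(): LC cross products.
NOINLINE void loop_func_sketch(int LC, double r[3],
                               const double a[3], const double b[3]) {
    for (int i = 0; i < LC; ++i)
        vprod_noinline(r, a, b);
}

// Average wall-clock time, in microseconds, of one run of `kernel`,
// measured over N back-to-back repetitions.
template <typename Kernel>
double average_us(Kernel kernel, int N) {
    assert(N > 0);
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point t0 = Clock::now();
    for (int i = 0; i < N; ++i)
        kernel();
    const std::chrono::duration<double, std::micro> dt = Clock::now() - t0;
    return dt.count() / N;
}
```

A call such as average_us([&] { loop_func_sketch(LC, vr, v1, v2); }, 100) would then play the role of the outer loop and the division by N. std::chrono::steady_clock measures elapsed wall-clock time, so other processes can still perturb individual runs; the averaging only smooths that out.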

    Anyway, here are my new results:

    (And yes, the odd result of “inline keyword, optimization -O1” being faster than -O2 or -O3 is repeatable, as is the oddity of “no inline keyword, optimization -O1”. I didn’t dig into the assembly to see why that might be.)

    //========================================================================================
    // File: so.h
    
    void loop_inline( const int LC, double vgr[3], double vg1[3], double vg2[3]);
    void loop_func( const int LC, double vgr[3], double vg1[3], double vg2[3]);
    
    //---------------------------------
    // Comment or uncomment to test both ways...
    #define USE_INLINE_KEYWORD
    //
    // Using g++ (GCC) 4.1.2 20080704 (Red Hat 4.1.2-52) on an x86 machine...
    //
    //                                 microseconds          microseconds
    //                               "hardcoded inline"   "via vprod() function"
    //                                                     [i]=inlined, [-]=not
    //                               ------------------   ----------------------
    // inline keyword
    //      no optimization                11734               14598 [-]
    //      optimization -O1                4617                4616 [i]
    //      optimization -O2                7754                7838 [i]
    //      optimization -O3                7777                7673 [i]
    //
    // no inline keyword
    //      no optimization                11807               14602 [-]
    //      optimization -O1                4651                7691 [-]
    //      optimization -O2                7755                7383 [-]
    //      optimization -O3                7921                7432 [-]
    //
    // Note that in all cases, both results were reported as -213.458.
    //
    /* My cut & paste "build & test script" to run on the Linux command prompt...
    
    echo ""; echo ""; echo ""; echo ""; echo ""; echo ""; echo ""; echo ""; echo ""
    rm -f a.out; g++ -c so.cpp so2.cpp; g++ so.o so2.o;
    echo ""; echo "No optimization:---------------"; objdump -d a.out | grep call | grep vprod; ./a.out
    rm -f a.out; g++ -O1 -c so.cpp so2.cpp; g++ so.o so2.o;
    echo ""; echo "Optimization -O1:---------------"; objdump -d a.out | grep call | grep vprod; ./a.out
    rm -f a.out; g++ -O2 -c so.cpp so2.cpp; g++ so.o so2.o;
    echo ""; echo "Optimization -O2:---------------"; objdump -d a.out | grep call | grep vprod; ./a.out
    rm -f a.out; g++ -O3 -c so.cpp so2.cpp; g++ so.o so2.o;
    echo ""; echo "Optimization -O3:---------------"; objdump -d a.out | grep call | grep vprod; ./a.out
    
    ...if the "objdump -d a.out | grep call | grep vprod" command returns something
    like "call   8048754 <_Z5vprodPdS_S_>", then I know that the call to vprod() is
    NOT inlined, whereas if it returns nothing, I know the call WAS inlined.
    
    */
    
    //========================================================================================
    // File: so.cpp
    
    #include <stddef.h> // for NULL
    #include <stdlib.h> // for exit()
    #include <stdio.h>  // for fprintf()
    #include <iostream>
    //#include <mkl.h>
    
    #include "so.h"
    
    // My standin for dsecnd() since I don't have "mkl.h"...
    #include <sys/time.h>
    double dsecnd()
    {
        struct timeval tv;
        if (gettimeofday(&tv,NULL))
        {
            fprintf(stderr,"\ngettimeofday() error\n\n");
            exit(1);
        }
        return tv.tv_sec*1000000.0 + tv.tv_usec; // ...returns MICROSECONDS (double arithmetic avoids integer overflow where long is 32 bits)
        //return tv.tv_sec + ((double)tv.tv_usec)/1000000; // ...returns SECONDS
    }
    
    //---------------------------------
    
    #ifndef USE_INLINE_KEYWORD
        // We're NOT using the 'inline' keyword, so define vprod() in this
        // file so it can't possibly be inlined where it's called (in the
        // other source file).
        void vprod( double vgr[3], double vg1[3], double vg2[3]){
        //void vprod( double *vgr, double *vg1, double *vg2){
            vgr[0] = vg1[1]*vg2[2]-vg1[2]*vg2[1];
            vgr[1] =-(vg1[0]*vg2[2]-vg1[2]*vg2[0]);
            vgr[2] = vg1[0]*vg2[1]-vg1[1]*vg2[0];
        }
    #endif
    
    int main() {
    
        double v1[3]={1.22, 2.65, 3.65}, v2[3]={6.98, 98.159, 54.65}, vr[3];
        int LC=1000000L;
        int i, N=100;
        double tiempo=0.0, tinicial;
    
        //------------------------------------------------------------------------
        std::cout << "INLINE METHOD: " << std::endl;
    
        tinicial = dsecnd();
    
        for (i=0; i<N; ++i)
            loop_inline(LC,vr,v1,v2);
    
        tiempo = (dsecnd() - tinicial)/N;
        std::cout << "Tiempo Total:             " << tiempo << std::endl;
        std::cout << "Resultado: " << vr[0] << std::endl;
        //------------------------------------------------------------------------
    
        //------------------------------------------------------------------------
        std::cout << "FUNCTION METHOD: " << std::endl;
        tinicial = dsecnd();
    
        for (i=0; i<N; ++i)
            loop_func(LC,vr,v1,v2);
    
        tiempo = (dsecnd() - tinicial)/N;
        std::cout << "Tiempo Total:             " << tiempo << std::endl;
        std::cout << "Resultado: " << vr[0] << std::endl;
        //------------------------------------------------------------------------
    
    //    std::cin.ignore();
        return 0;
    }
    
    //========================================================================================
    // File: so2.cpp
    
    #include "so.h"
    
    #ifdef USE_INLINE_KEYWORD
        inline void vprod( double vgr[3], double vg1[3], double vg2[3]){
        //void vprod( double *vgr, double *vg1, double *vg2){
            vgr[0] = vg1[1]*vg2[2]-vg1[2]*vg2[1];
            vgr[1] =-(vg1[0]*vg2[2]-vg1[2]*vg2[0]);
            vgr[2] = vg1[0]*vg2[1]-vg1[1]*vg2[0];
        }
    #else
        // Not using 'inline' keyword, so just declare (prototype) the
        // function here and define it in the other source file (so it
        // can't possibly be inlined).
        void vprod( double vgr[3], double vg1[3], double vg2[3]);
    #endif
    
    void loop_inline( const int LC, double vgr[3], double vg1[3], double vg2[3]){
    
        for (int i=0; i<LC; i++) {
            vgr[0] = vg1[1]*vg2[2]-vg1[2]*vg2[1];
            vgr[1] =-(vg1[0]*vg2[2]-vg1[2]*vg2[0]);
            vgr[2] = vg1[0]*vg2[1]-vg1[1]*vg2[0];
        }
    }
    
    void loop_func( const int LC, double vgr[3], double vg1[3], double vg2[3]){
    
        for (int i=0; i<LC; i++) {
            vprod (vgr,vg1,vg2);
        }
    }
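
As a small worked example of how to read the table in so.h, the difference between the two columns can be converted into a per-call cost. This is only arithmetic on the numbers reported above; the helper name overhead_ns_per_call is made up for the illustration.

```cpp
#include <cassert>

// Hypothetical helper: extra cost per call in nanoseconds, given the time of
// the hardcoded loop and of the loop that calls the function (both in
// microseconds) and the number of iterations in one loop.
inline double overhead_ns_per_call(double inline_us, double call_us,
                                   long iterations) {
    assert(iterations > 0);
    // 1 microsecond = 1000 nanoseconds
    return (call_us - inline_us) * 1000.0 / iterations;
}
```

With the unoptimized row (11734 vs. 14598 microseconds for 1000000 iterations) this gives about 2.9 ns per call, and with the -O1 row without the inline keyword (4651 vs. 7691) about 3.0 ns. For some rows, such as -O2 without the inline keyword (7755 vs. 7383), the difference is negative, which suggests those differences are within measurement noise rather than a real call cost.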
    