I was reading ‘C++ Template complete guide’ book, part about meta programming. There is


Asked: June 2, 2026

I was reading the metaprogramming part of the book "C++ Templates: The Complete Guide". There is an example of loop unrolling (17.7). I've implemented a program that calculates a dot product:

#include <iostream>
#include <sys/time.h>

using namespace std;

template<int DIM, typename T>
struct Functor
{
    static T dot_product(T *a, T *b)
    {
        return *a * *b + Functor<DIM - 1, T>::dot_product(a + 1, b + 1);
    }
};

template<typename T>
struct Functor<1, T>
{
    static T dot_product(T *a, T *b)
    {
        return *a * *b;
    }
};


template<int DIM, typename T>
T dot_product(T *a, T *b)
{
    return Functor<DIM, T>::dot_product(a, b);
}

double dot_product(int DIM, double *a, double *b)
{
    double res = 0;
    for (int i = 0; i < DIM; ++i)
    {
        res += a[i] * b[i];
    }
    return res;
}


int main(int argc, const char * argv[])
{
    static const int DIM = 100;

    double a[DIM];
    double b[DIM];

    for (int i = 0; i < DIM; ++i)
    {
        a[i] = i;
        b[i] = i;
    }


    {
        timeval startTime;
        gettimeofday(&startTime, 0);

        for (int i = 0; i < 100000; ++i)
        {
            double res = dot_product<DIM>(a, b); 
            //double res = dot_product(DIM, a, b);
        }

        timeval endTime;
        gettimeofday(&endTime, 0);

        double tS = startTime.tv_sec * 1000000 + startTime.tv_usec;
        double tE = endTime.tv_sec   * 1000000 + endTime.tv_usec;

        cout << "template time: " << tE - tS << endl;
    }

    {
        timeval startTime;
        gettimeofday(&startTime, 0);

        for (int i = 0; i < 100000; ++i)
        {
            double res = dot_product(DIM, a, b);
        }

        timeval endTime;
        gettimeofday(&endTime, 0);

        double tS = startTime.tv_sec * 1000000 + startTime.tv_usec;
        double tE = endTime.tv_sec   * 1000000 + endTime.tv_usec;

        cout << "loop time: " << tE - tS << endl;
    }

    return 0;
}

I'm using Xcode with all code optimisations turned off. According to the book, I expected the template version to be faster than the simple loop. But the results are (times in microseconds; t = template, l = loop):

DIM 5: t = ~5000, l = ~3500

DIM 50: t = ~55000, l = 16000

DIM 100: t = 130000, l = 36000

I've also tried making the template functions inline, with no performance difference.

Why is the simple loop so much faster?


1 Answer

Editorial Team · Answer 1 · 2026-06-02T01:05:27+00:00

Depending on the compiler, if you don’t turn on performance optimizations, loop unrolling might not occur.

The reason is fairly simple: your recursive template instantiations create a series of separate functions, one for each value from DIM down to 1. The compiler can't turn all of that into an inlined, unrolled expression and still keep sensible debugging information available. Suppose a segfault happens somewhere inside one of those functions, or an exception is thrown. Wouldn't you want a stack trace that shows each frame? The compiler assumes you might, unless you turn on optimizations, which gives it permission to go to town on your code.
