I’m testing the following code:
#include <iostream>
#include <vector>
#include <algorithm>
#include <ctime>
int main(int argc, char* argv[])
{
    std::vector<int> v(10000000);
    clock_t then = clock();
    if(argc <= 1)
        std::for_each(v.begin(), v.end(), [](int& it){ it = 10098; });
    else
        for(auto it = v.begin(); it != v.end(); ++it) *it = 98775;
    std::cout << clock() - then << "\n";
    return 0;
}
I’m compiling it with g++ 4.6, without any optimization flags and here is what I get:
[javadyan@myhost experiments]$ ./a.out
260000
[javadyan@myhost experiments]$ ./a.out aaa
330000
[javadyan@myhost experiments]$
Using -O1 optimization yields the following (unsurprising) results:
[javadyan@myhost experiments]$ ./a.out
20000
[javadyan@myhost experiments]$ ./a.out aaa
20000
I’m running Linux 3.0 on a dual-core 2 GHz laptop, if that matters.
What I’m wondering is how, in a program compiled without any optimizations, a call to for_each with a lambda function could take fewer clock ticks than a plain for loop. Shouldn’t there be at least a slight overhead from calling the anonymous function? Is there any documentation on how code like this
std::for_each(v.begin(), v.end(), [](int& it){ it = 10098; });
is handled by g++? What is the behavior of other popular compilers in this case?
UPDATE
I didn’t consider the fact that, in the second expression, it gets compared to a freshly evaluated v.end() on every iteration. With that fixed, the for loop takes fewer clock ticks than for_each. However, I’m still curious about how the compiler optimizes the for_each when the -O1 flag is used.
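At first glance, I can at least say that those two expressions are not equivalent: the for loop calls v.end() on every iteration, whereas for_each receives the end iterator only once. Try the loop with v.end() stored in a local variable instead.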
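A minimal sketch of that change, assuming the point is simply to evaluate v.end() once before the loop starts. The helper name fill_hoisted and the local name end are my own and not from the question:

```cpp
#include <vector>

// Hypothetical helper: the same loop as in the question, except that
// v.end() is evaluated once and kept in the local variable `end`.
void fill_hoisted(std::vector<int>& v, int value)
{
    for (auto it = v.begin(), end = v.end(); it != end; ++it)
        *it = value;
}
```

In an unoptimized build this avoids one non-inlined call to end() per element, which is probably where most of the difference in the question comes from. With optimizations on, the compiler can usually do this hoisting by itself.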
Also, since the exact type of the lambda is passed to for_each, there is a pretty good chance that the compiler will inline it, resulting in code no different from the for loop. Note that there are no virtual calls involved in anonymous functions. The compiler will turn the lambda into something like a small function-object class with an inline operator(), which, in addition to inlining, will result in the very same code as the for loop (with my modification included).
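As a rough illustration of that transformation, here is a hand-written function object that behaves like the lambda from the question. The names AssignConstant and fill_with_functor are hypothetical; the real closure type generated by the compiler is unnamed, and this is only a sketch of its shape, not what g++ literally emits:

```cpp
#include <algorithm>
#include <vector>

// Hand-written stand-in for the closure type of
// [](int& it){ it = 10098; }. Like a lambda's call operator, this
// operator() is const, non-virtual, and defined inline in the class.
struct AssignConstant
{
    void operator()(int& it) const { it = 10098; }
};

// Hypothetical wrapper: for_each is instantiated with the exact type
// AssignConstant, so the call target is known at compile time.
void fill_with_functor(std::vector<int>& v)
{
    std::for_each(v.begin(), v.end(), AssignConstant());
}
```

Because the call operator is visible and statically bound at the point where for_each is instantiated, an optimizing build is free to inline it into the loop body.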