Recently, a coworker pointed out to me that compiling everything into a single file created much more efficient code than compiling separate object files – even with link time optimization turned on. In addition, the total compile time for the project went down significantly. Given that one of the primary reasons for using C++ is code efficiency, this was surprising to me.
Clearly, when the archiver/linker makes a library out of object files, or links them into an executable, even simple optimizations are penalized. In the example below, trivial inlining costs 1.8% in performance when done by the linker instead of the compiler. It seems like compiler technology should be advanced enough to handle fairly common situations like this, but it isn’t happening.
Here is a simple example using Visual Studio 2008:
#include <cstdlib>
#include <iostream>
#include <boost/timer.hpp>
using namespace std;
int foo(int x);
int foo2(int x) { return x++; }
int main(int argc, char** argv)
{
boost::timer t;
t.restart();
for (int i=0; i<atoi(argv[1]); i++)
foo (i);
cout << "time : " << t.elapsed() << endl;
t.restart();
for (int i=0; i<atoi(argv[1]); i++)
foo2 (i);
cout << "time : " << t.elapsed() << endl;
}
foo.cpp
int foo (int x) { return x++; }
Results of run: 1.8% performance hit to using linked foo instead of inline foo2.
$ ./release/testlink.exe 100000000
time : 13.375
time : 13.14
And yes, the linker optimization flags (/LTCG) are on.
I’m not a compiler specialist, but I think the compiler has much more information available at disposal to optimize as it operates on a language tree, as opposed to the linker that has to content itself to operate on the object output, far less expressive than the code the compiler has seen. Hence less effort is spent by linker and compiler development team(s) into making linker optimizations that could match, in theory, the tricks the compiler does.
BTW, I’m sorry I distracted your original question into the ltcg discussion. I now understand your question was a little bit different, more concerned with the link time vs. compile time static optimizations possible/available.