Consider the following:
struct Point {double x; double y;};
double complexComputation(const Point& p1, const Point& p2)
{
// p1 and p2 used frequently in computations
}
Do compilers optimize the pass-by-reference into pass-by-copy to prevent frequent dereferencing? In other words, would they convert complexComputation into this:
double complexComputation(const Point& p1, const Point& p2)
{
double x1 = p1.x; double x2 = p2.x;
double y1 = p1.y; double y2 = p2.y;
// x1, x2, y1, y2 stored in registers and used frequently in computations
}
Since Point is a POD, there can be no side effect by making a copy behind the caller’s back, right?
If that’s the case, then I can always just pass POD objects by const reference, no matter how small, and not have to worry about the optimal passing semantics. Right?
EDIT:
I’m interested in the GCC compiler in particular. I guess I might have to write some test code and look at the ASM.
There are 2 issues.
Firstly, the compiler will not convert pass-by-reference into pass-by-value, especially if complexComputation is not static (i.e. it can be called from other object files).

The reason is ABI compatibility. To the CPU, there is no such thing as a "reference"; the compiler implements references as pointers. Parameters are passed on the stack or in registers, so code calling complexComputation will likely pass just the two addresses (assume for a moment that pointers and doubles are 4 bytes each and that the arguments go on the stack). Only 8 bytes are pushed onto the stack.

Pass-by-copy, on the other hand, will push both structs in full onto the stack. This time 16 bytes are pushed, and the contents are the numbers themselves, not pointers. If complexComputation changed its parameter-passing semantics, the input from existing callers would become garbage and your program might crash.

On the other hand, the optimization you describe (copying the fields into locals such as x1, x2, y1, y2) can easily be done, since the compiler can recognize which variables are used very often and keep them in reserved registers (e.g. r4 ~ r13 in the ARM architecture, and many of the sXX/dXX registers) for faster access.
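The difference between the two passing conventions can also be observed from within C++, without reading any assembly. The following is only a minimal sketch: the function names and the helper checkPassingSemantics are made up for illustration and are not part of the question's code. It checks that a reference parameter refers to the caller's own object, while a by-value parameter is a distinct copy.

```cpp
#include <cassert>

struct Point { double x; double y; };

// A reference parameter is another name for the caller's object, so the
// callee sees the same address the caller holds.
bool refSeesCallerObject(const Point& p, const Point* callerAddr) {
    return &p == callerAddr;
}

// A by-value parameter is a separate object that holds a copy of the fields,
// so its address differs from the caller's object.
bool valueSeesCallerObject(Point p, const Point* callerAddr) {
    return &p == callerAddr;
}

// Hypothetical helper that exercises both functions.
void checkPassingSemantics() {
    Point a{1.0, 2.0};
    assert(refSeesCallerObject(a, &a));     // same object
    assert(!valueSeesCallerObject(a, &a));  // distinct copy
}
```

This is why a caller and a callee have to agree on the convention: one side hands over an address, the other side would expect the data itself.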
After all, if you want to know if a compiler has done something, you can always disassemble the resulting objects and compare.
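As a starting point for such a comparison, here is a small test file, offered as a sketch rather than a definitive benchmark. The function names and the squared-distance computation are assumptions chosen only so that each field is read more than once; the file name points.cpp in the comment is likewise hypothetical.

```cpp
// Compile to assembly with, for example:
//   g++ -O2 -S points.cpp -o points.s
// then compare the bodies of the three functions in points.s.

struct Point { double x; double y; };

// Version 1: reads the fields through the references at every use.
double distSqByRef(const Point& p1, const Point& p2) {
    return (p1.x - p2.x) * (p1.x - p2.x) + (p1.y - p2.y) * (p1.y - p2.y);
}

// Version 2: copies the fields into locals first, as in the question.
double distSqByRefHoisted(const Point& p1, const Point& p2) {
    double x1 = p1.x; double x2 = p2.x;
    double y1 = p1.y; double y2 = p2.y;
    return (x1 - x2) * (x1 - x2) + (y1 - y2) * (y1 - y2);
}

// Version 3: takes both structs by value.
double distSqByValue(Point p1, Point p2) {
    return (p1.x - p2.x) * (p1.x - p2.x) + (p1.y - p2.y) * (p1.y - p2.y);
}
```

All three compute the same value; what may differ is how the arguments arrive and how often memory is read, which is what the generated assembly would show for your particular GCC version and target.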