Consider the following code, which binds a temporary object to a const reference in “nested” fashion:
#include <iostream>
#include <string>

std::string foo()
{
    return "abc";
}

std::string goo()
{
    const std::string & a = foo();
    return a;
}

int main()
{
    // Is a temporary allocated on the heap to support this, even for a moment?
    const std::string & b = goo();
}
I have been trying to understand what the compiler must do in terms of memory storage in order to support this “nested” construct.
I suspect that for the call to foo(), memory allocation is straightforward: storage for a std::string will be allocated on the stack as the function foo() exits.
However, what must the compiler do to support storage for the object referenced by b? The stack for the function goo must unwind and “be replaced with” an object on the stack to which b refers, but in order to unwind the stack for goo, will the compiler be required to momentarily create a copy of the object on the heap (before copying it back to the stack in a different location)?
Or is it possible for the compiler to accomplish the requirements of this construct without any storage being allocated on the heap, even for a moment?
Or is it even possible for the compiler to use the same storage location for the object referred to by b as for the object referred to by a, without doing any additional allocation either on the stack or on the heap?
The C++ standard allows the compiler to rebuild your code into a form that never touches the free store. Assume full NRVO first. The rewrite relies on placement new, which is a moderately obscure C++ feature: you pass new a pointer, and it constructs the result there instead of in the free store.
If we blocked NRVO in goo, the rewrite would look different, but the idea is the same. Basically, the compiler knows the lifetime of the references, so it can create “anonymous variables” that store the actual instance of the variable, then create references to them.
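A minimal sketch of the NRVO case described above, with the hidden result pointer spelled out as an explicit parameter. The names String, foo_into, goo_into and demo are hypothetical and use ordinary identifiers instead of reserved __-prefixed ones; returning the pointer produced by placement new is a convenience of this sketch, not something a compiler is required to do.

```cpp
#include <cassert>
#include <new>      // placement new
#include <string>

using String = std::string;

// Hypothetical lowering of foo(): the caller supplies raw storage and the
// result is constructed directly in it.
String* foo_into(void* result_storage)
{
    return new (result_storage) String("abc");
}

// Hypothetical lowering of goo() with the copy optimized away: it forwards
// the caller's storage, so only one string object is ever created.
String* goo_into(void* result_storage)
{
    return foo_into(result_storage);
}

// Stands in for the body of main(); returns a copy of what b refers to.
String demo()
{
    // The "anonymous variable": raw, suitably aligned storage in this frame.
    alignas(String) unsigned char b_storage[sizeof(String)];

    String* b_object = goo_into(b_storage);
    const String& b = *b_object;   // the reference the programmer wrote
    assert(b == "abc");

    String seen = b;               // copied out only so the demo can report it

    b_object->~String();           // end of b's scope: destroy the object
    return seen;
}
```

Calling demo() from a real main yields "abc". The only storage involved is the local buffer in demo; any heap use would come from std::string's own character buffer, not from the reference binding.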
I also noted that when you call a function, you effectively (implicitly) pass in a pointer to the buffer where the return value goes. So the called function constructs the object ‘in place’ in the caller’s scope.
With NRVO, a named variable in the called function’s scope is actually constructed in the calling function’s “where the return value goes” buffer, which makes returning easy. Without it, you have to do everything locally, then at the return statement copy your return value to the implicit return-value pointer via the equivalent of placement new.
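The no-NRVO case can be sketched the same way: goo keeps its own anonymous variable for a to refer to, then copies it into the caller's buffer at the return statement. As before, String, foo_into, goo_into and demo are hypothetical names, and exception handling is left out to keep the sketch short.

```cpp
#include <cassert>
#include <new>      // placement new
#include <string>

using String = std::string;

// Hypothetical lowering of foo(): construct the result in caller storage.
String* foo_into(void* result_storage)
{
    return new (result_storage) String("abc");
}

// Hypothetical lowering of goo() without NRVO.
String* goo_into(void* result_storage)
{
    // Anonymous variable in goo's frame; the reference a binds to it.
    alignas(String) unsigned char a_storage[sizeof(String)];
    String* a_object = foo_into(a_storage);
    const String& a = *a_object;

    // "return a;" copy-constructs the result where the caller asked for it.
    String* result = new (result_storage) String(a);

    // The temporary bound to a is destroyed as goo's frame unwinds.
    a_object->~String();
    return result;
}

// Stands in for the body of main(); returns a copy of what b refers to.
String demo()
{
    alignas(String) unsigned char b_storage[sizeof(String)];
    String* b_object = goo_into(b_storage);
    const String& b = *b_object;
    assert(b == "abc");

    String seen = b;       // copied out only so the demo can report it
    b_object->~String();   // end of b's scope
    return seen;
}
```

The copy goes straight from goo's frame into the caller's buffer, so no intermediate heap object is needed while goo's stack unwinds.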
Nothing needs to be done on the heap (aka free store), because the lifetimes are all easily provable and stack-ordered.
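One way to check this on your own compiler is to swap std::string for an instrumented type. The sketch below assumes a hypothetical struct Tracked whose class-specific operator new counts every heap allocation of a Tracked object; foo and goo mirror the question, and run stands in for main.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Tracked {
    static int constructed;       // objects built from a value
    static int copied;            // objects built by copy
    static int heap_allocations;  // objects allocated with new

    int value;

    explicit Tracked(int v) : value(v) { ++constructed; }
    Tracked(const Tracked& other) : value(other.value) { ++copied; }

    // Runs only when a Tracked is created with a new-expression.
    static void* operator new(std::size_t size)
    {
        ++heap_allocations;
        return ::operator new(size);
    }
    static void operator delete(void* p) { ::operator delete(p); }
};

int Tracked::constructed = 0;
int Tracked::copied = 0;
int Tracked::heap_allocations = 0;

Tracked foo()
{
    return Tracked(7);
}

Tracked goo()
{
    const Tracked& a = foo();  // binds to a temporary in goo's frame
    return a;                  // a is a reference, so this is a copy
}

// Stands in for main(): binds the result of goo() to a const reference.
int run()
{
    const Tracked& b = goo();
    assert(Tracked::heap_allocations == 0);  // nothing went through new
    return b.value;
}
```

With a C++17 compiler I would expect one construction and one copy after a single call to run, and the heap counter stays at zero because neither temporary is ever created through new.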
The original foo and goo with the expected signatures would still have to exist, as they have external linkage, until possibly discarded when it is found that nobody uses them.
All variables and functions starting with __ exist for exposition only. The compiler/execution environment no more needs to have a named variable than you need to have a name for a red blood cell. (In theory, because __ is reserved, a compiler that did such a translation pass before compiling would probably be legal, and if you actually used those variable names and it failed to compile it would be your fault, not the compiler’s, but … that would be a pretty hacky compiler. 😉 )