Consider the following code:
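    #include <iostream>
    #include <type_traits>
    #include <utility>   // std::forward

    // Instrumented type: reports every copy and move construction.
    struct A
    {
        A() {}
        A(const A&) { std::cout << "Copy" << std::endl; }
        A(A&&) { std::cout << "Move" << std::endl; }
    };

    // Aggregate wrapper: public member, no user-defined constructors.
    template <class T>
    struct B
    {
        T x;
    };

    // Macro version: the temporary initializes the member directly.
    #define MAKE_B(x) B<decltype(x)>{ x }

    // Function version: the temporary is bound to a reference parameter first.
    template <class T>
    B<T> make_b(T&& x)
    {
        return B<T> { std::forward<T>(x) };
    }

    int main()
    {
        std::cout << "Macro make b" << std::endl;
        auto b1 = MAKE_B( A() );
        std::cout << "Non-macro make b" << std::endl;
        auto b2 = make_b( A() );
    }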
This outputs the following:
Macro make b
Non-macro make b
Move
Note that b1 is constructed without a move, but the construction of b2 requires a move.
I also need type deduction, as A in real-life usage may be a complex type that is difficult to write explicitly. I also need to be able to nest calls (e.g. make_c(make_b(A()))).
Is it possible to write a make_b function that, like the macro, constructs its result without the move?
Further thoughts:
N3290 (final C++0x draft), page 284:
This elision of copy/move operations, called copy elision, is permitted in the following circumstances:
when a temporary class object that has not been bound to a reference (12.2) would be copied/moved to a class object with the same cv-unqualified type, the copy/move operation can be omitted by constructing the temporary object directly into the target of the omitted copy/move
Unfortunately, this seems to mean that we can't elide copies (or moves) from function parameters to function results (constructors included), because those temporaries are either bound to a reference (when passed by reference) or no longer temporaries (when passed by value). It seems the only way to elide all copies when creating a composite object is to create it as an aggregate. However, aggregates have certain restrictions, such as requiring all members to be public and forbidding user-defined constructors.
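To make the difference concrete, here is a minimal sketch that counts copy and move constructions for the two cases. The names Counted, Aggregate, NonAggregate, aggregate_cost and constructor_cost are hypothetical and introduced only for illustration. The zero count for the aggregate case assumes the compiler performs the elision, which is permitted in C++11 and guaranteed from C++17 on.

```cpp
#include <utility>

// Hypothetical instrumented type: counts copy and move constructions.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Aggregate: public member, no user-defined constructors.
struct Aggregate {
    Counted x;
};

// Non-aggregate: the constructor parameter is a named object, so
// initializing the member from it costs at least one move.
class NonAggregate {
public:
    explicit NonAggregate(Counted c) : x(std::move(c)) {}
private:
    Counted x;
};

// Copies + moves observed when an aggregate member is initialized
// directly from a temporary.
int aggregate_cost() {
    Counted::copies = Counted::moves = 0;
    Aggregate a{ Counted() };
    (void)a;
    return Counted::copies + Counted::moves;
}

// Copies + moves observed when the same temporary goes through a
// by-value constructor parameter.
int constructor_cost() {
    Counted::copies = Counted::moves = 0;
    NonAggregate n{ Counted() };
    (void)n;
    return Counted::copies + Counted::moves;
}
```

With elision in effect, aggregate_cost() reports no copies or moves, while constructor_cost() reports the single move from the parameter into the member.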
I don't think it makes sense for C++ to allow this optimization for aggregate construction of POD C-style structs but not for construction of non-POD C++ classes.
Is there any way to allow copy/move elision for non-aggregate construction?
My answer:
This construct allows copies to be elided for non-POD types. I got the idea from David Rodríguez's answer below. It requires C++11 lambdas. In the example below I've changed make_b to take two arguments to make things less trivial. There are no calls to any move or copy constructors.
    #include <iostream>
    #include <type_traits>

    struct A
    {
        A() {}
        A(const A&) { std::cout << "Copy" << std::endl; }
        A(A&&) { std::cout << "Move" << std::endl; }
    };

    template <class T>
    class B
    {
    public:
        // Each member is initialized straight from the result of a callable,
        // so no parameter object of type T ever exists.
        template <class LAMBDA1, class LAMBDA2>
        B(const LAMBDA1& f1, const LAMBDA2& f2) : x1(f1()), x2(f2())
        {
            std::cout
                << "I'm a non-trivial, therefore not a POD.\n"
                << "I also have private data members, so definitely not a POD!\n";
        }
    private:
        T x1;
        T x2;
    };

    // Wrap an expression in a lambda so that it is evaluated later.
    #define DELAY(x) [&]{ return x; }
    #define MAKE_B(x1, x2) make_b(DELAY(x1), DELAY(x2))

    template <class LAMBDA1, class LAMBDA2>
    auto make_b(const LAMBDA1& f1, const LAMBDA2& f2) -> B<decltype(f1())>
    {
        return B<decltype(f1())>( f1, f2 );
    }

    int main()
    {
        auto b1 = MAKE_B( A(), A() );
    }
If anyone knows how to achieve this more neatly I’d be quite interested to see it.
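For what it's worth, the same delayed-construction idea also seems to cover the nesting requirement from the question. Below is a minimal sketch without macros. Counted, Holder, make_holder and nested_cost are hypothetical names, and the zero count assumes the compiler elides the returned temporaries (permitted in C++11, guaranteed from C++17 on).

```cpp
// Hypothetical instrumented type: counts copy and move constructions.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Non-aggregate wrapper: private member and a user-defined constructor.
template <class T>
class Holder {
public:
    // The member is initialized from the callable's result, so the
    // value never passes through a constructor parameter.
    template <class F>
    explicit Holder(const F& make) : x(make()) {}
private:
    T x;
};

// The wrapped type is deduced from what the callable returns.
template <class F>
auto make_holder(const F& make) -> Holder<decltype(make())> {
    return Holder<decltype(make())>(make);
}

// Copies + moves observed when building a Holder<Holder<Counted>>.
int nested_cost() {
    Counted::copies = Counted::moves = 0;
    auto h = make_holder([] {
        return make_holder([] { return Counted(); });
    });
    (void)h;
    return Counted::copies + Counted::moves;
}
```

The price is that every call site has to wrap its argument in a lambda, which is exactly what the DELAY macro hides.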
Previous discussion:
This somewhat follows on from the answers to the following questions:
Can creation of composite objects from temporaries be optimised away?
Avoiding need for #define with expression templates
Eliminating unnecessary copies when building composite objects
As Anthony has already mentioned, the standard forbids copy elision from the argument of a function to the return of the same function. The rationale that drives that decision is that copy elision (and move elision) is an optimization by which two objects in the program are merged into the same memory location, that is, the copy is elided by having both objects be one. The (partial) standard quote is below, followed by a set of circumstances under which copy elision is allowed, which do not include that particular case.
So what makes that particular case different? The difference is basically that there is a function call between the original and the copied objects, and the function call brings extra constraints to consider, in particular the calling convention.
Given a function T foo( T ), and a user calling T x = foo( T(param) );, in the general case, with separate compilation, the compiler will create an object $tmp1 in the location that the calling convention requires the first argument to be. It will then call the function and initialize x from the return statement. Here is the first opportunity for copy elision: by carefully placing x on the location where the returned temporary is, x and the returned object from foo become a single object, and that copy is elided. So far so good. The problem is that the calling convention in general will not have the returned object and the parameter in the same location, and because of that, $tmp1 and x cannot be a single location in memory.
Without seeing the function definition the compiler cannot possibly know that the only purpose of the argument to the function is to serve as the return statement, and as such it cannot elide that extra copy. It can be argued that if the function is inline then the compiler would have the missing extra information to understand that the temporary used to call the function, the returned value and x are a single object. The problem is that that particular copy can only be elided if the code is actually inlined (not only if it is marked as inline but actually inlined). If a function call is required, then the copy cannot be elided. If the standard allowed that copy to be elided when the code is inlined, it would imply that the behavior of a program would differ due to the compiler and not user code: the inline keyword does not force inlining, it only means that multiple definitions of the same function do not represent a violation of the ODR.
Note that if the variable was created inside the function (as compared to passed into it) as in:
    T foo() { T tmp; ...; return tmp; }
    T x = foo();
then both copies can be elided. There is no restriction as to where tmp has to be created (it is not an input or output parameter to the function, so the compiler is able to relocate it anywhere, including the location of the returned object), and on the calling side, x can, as in the previous example, be carefully located in the location of that same return statement, which basically means that tmp, the return statement and x can be a single object.
As for your particular problem, if you resort to a macro, the code is inlined, there are no restrictions on the objects and the copy can be elided. But if you add a function, you cannot elide the copy from the argument to the return statement. So just avoid it. Instead of using a template that will move the object, create a template that will construct an object:
And that copy can be elided by the compiler.
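As a rough illustration of the difference between passing an object through a function and constructing it inside one, here is a minimal sketch. It is not the code from the original answer: Counted, pass_through, create and the two *_cost helpers are hypothetical names, and the use of a variadic template with perfect forwarding is an assumption about what a constructing template could look like. The counts assume the compiler elides the returned temporary, which is permitted in C++11 and guaranteed from C++17 on.

```cpp
#include <utility>

// Hypothetical instrumented type: counts copy and move constructions.
struct Counted {
    static int copies;
    static int moves;
    Counted() {}
    explicit Counted(int) {}
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) { ++moves; }
};
int Counted::copies = 0;
int Counted::moves = 0;

// Argument to return value: the parameter and the returned object live in
// different places, so this costs a move even for a temporary argument.
template <class T>
T pass_through(T x) {
    return x;
}

// Constructs the object from constructor arguments instead of from an
// already built object, so there is no parameter of type T to copy from.
template <class T, class... Args>
T create(Args&&... args) {
    return T(std::forward<Args>(args)...);
}

// Copies + moves observed when a temporary is passed through a function.
int pass_through_cost() {
    Counted::copies = Counted::moves = 0;
    Counted r = pass_through(Counted());
    (void)r;
    return Counted::copies + Counted::moves;
}

// Copies + moves observed when the object is built inside the function.
int create_cost() {
    Counted::copies = Counted::moves = 0;
    Counted r = create<Counted>(42);
    (void)r;
    return Counted::copies + Counted::moves;
}
```

Here pass_through_cost() reports one move, while create_cost() reports none, because the only Counted object is the one built in the return statement.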
Note that I have not dealt with move construction, as you seem concerned about the cost of even a move construction, though I believe you are barking up the wrong tree. Given a motivating real use case, I am quite sure that people here will come up with a couple of efficient ideas.
12.8/31