When you return an STL vector from a function:
vector<int> getLargeArray() { ... }
Is the return going to be an expensive copy operation? I remember reading somewhere that vector assignment is fast. Should I require the caller to pass a reference instead?
void getLargeArray( vector<int>& vec ) { ... }
Assuming your function constructs and returns new data, you should return by value, and try to make sure that the function itself has one return point that returns a variable of type vector<int>, or at worst several return points that all return the same variable.
That ensures that you’ll get the named return value optimization on any credible compiler, which eliminates one of the potential copies (the one from the value in the function, to the return value). There are other ways to get a return value optimization, but they’re not wholly predictable, so the simple rule plays safe.
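As a rough illustration of that rule, here is a minimal sketch of a function with one local vector and one return statement naming it. The `count` parameter, the global probe `g_bufferInsideFunction`, and the helper `callerReceivesSameBuffer` are all hypothetical names added for the demonstration, and `data()` and `nullptr` assume C++11 or later. Under those assumptions the element buffer the caller ends up with is the one the function filled, whether the compiler elides the copies or falls back to a move.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical probe, used only so the caller can see where the
// elements lived while the function was still running.
static const int* g_bufferInsideFunction = nullptr;

// One local vector and a single return statement that names it:
// the shape that makes the named return value optimization likely.
std::vector<int> getLargeArray(std::size_t count) {
    std::vector<int> result;
    result.reserve(count);  // no reallocation during the loop below
    for (std::size_t i = 0; i < count; ++i) {
        result.push_back(static_cast<int>(i));
    }
    g_bufferInsideFunction = result.data();
    return result;  // single return point, same variable
}

// Uses the call as the initializer of a new vector, then checks that
// the elements were handed over rather than copied to a new buffer.
bool callerReceivesSameBuffer() {
    std::vector<int> v = getLargeArray(1000);
    return v.size() == 1000 && v.data() == g_bufferInsideFunction;
}
```

In a pure C++03 build without the optimization, the same check could fail, which is exactly the copy the single-return-point rule is meant to avoid.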
Next, you want to eliminate the potential copy from the return value to whatever the caller does with it. It’s the caller’s problem to solve, not the callee’s, and there are basically three ways to do this:
- Use the call to the function as the initializer of a vector<int>, in which case again any credible C++ compiler will elide the copy.
- Use C++11 or later, where vector has move semantics.
- In C++03, swap the returned temporary into the destination vector instead of assigning it.
That is, in C++03 don’t write
v = getLargeArray();
Instead:
getLargeArray().swap(v);
This avoids the copy assignment that’s required (must not be elided[*]) for v = getLargeArray(). It’s not needed in C++11, where there’s a cheap move assignment instead of the expensive copy assignment, but of course it still works.
Another thing to consider is whether you actually want vector as part of your interface. You could instead perhaps write a function template that takes an output iterator, and writes the data to that output iterator. Callers who want the data in a vector can then pass in the result of std::back_inserter, and so can callers who want the data in a deque or list. Callers who know the size of the data in advance could even pass just a vector iterator (after a suitable resize()) or a raw pointer to a large enough array, to avoid the overhead of back_insert_iterator. There are non-template ways of doing the same thing, but they’ll most likely incur a call overhead one way or another. If you’re worried about the cost of copying an int per element, then you’re worried about the cost of a function call per element.
If your function doesn’t construct and return new data, but rather it returns the current contents of some existing vector<int> and isn’t allowed to change the original, then you can’t avoid at least one copy when you return by value. So if the performance of that is a proven problem, then you need to look at some API other than return-by-value. For example you might supply a pair of iterators that can be used to traverse the internal data, a function to look up a value in the vector by index, or even (if the performance problem is so serious as to warrant exposing your internals) a reference to the vector. Obviously in all those cases you change the meaning of the function: instead of giving the caller “their own data”, it now provides a view of someone else’s data, which might change.
[*] Of course the “as if” rule still applies, and one can imagine a C++ implementation that’s smart enough to realise that since this is a vector of a trivially copyable type (int), and since you haven’t taken any pointers to any elements (I assume), then it can swap instead and the result is “as if” it copied. But I wouldn’t count on it.
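The output-iterator interface suggested above could look something like the following sketch. The names `writeLargeArray`, `intoVector`, `intoPresizedVector`, `intoDeque` and `demoRawArray` are hypothetical, and the generated data (the values 0 to count - 1) is just a placeholder for whatever the real function produces.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <iterator>
#include <vector>

// Hypothetical generator: writes the values 0, 1, ..., count - 1 to out
// and returns the iterator one past the last element written.
template <typename OutputIt>
OutputIt writeLargeArray(std::size_t count, OutputIt out) {
    for (std::size_t i = 0; i < count; ++i) {
        *out = static_cast<int>(i);
        ++out;
    }
    return out;
}

// Caller that wants a vector and does not know the size in advance.
std::vector<int> intoVector(std::size_t count) {
    std::vector<int> v;
    writeLargeArray(count, std::back_inserter(v));
    return v;
}

// Caller that knows the size: resize first, then pass a plain iterator,
// which avoids going through back_insert_iterator for each element.
std::vector<int> intoPresizedVector(std::size_t count) {
    std::vector<int> v;
    v.resize(count);
    writeLargeArray(count, v.begin());
    return v;
}

// Caller that prefers a different container.
std::deque<int> intoDeque(std::size_t count) {
    std::deque<int> d;
    writeLargeArray(count, std::back_inserter(d));
    return d;
}

// Caller that writes through a raw pointer into a large enough array.
void demoRawArray() {
    int raw[4];
    int* end = writeLargeArray(4, raw);
    assert(end == raw + 4);
    assert(raw[3] == 3);
}
```

The callee never mentions a container type, so the choice between vector, deque, list, or a plain array stays entirely with the caller.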