I would like some information on how to correctly think about C++11 closures and std::function in terms of how they are implemented and how memory is handled.
Although I don’t believe in premature optimisation, I do have a habit of carefully considering the performance impact of my choices while writing new code. I also do a fair amount of real-time programming, e.g. on microcontrollers and for audio systems, where non-deterministic memory allocation/deallocation pauses are to be avoided.
Therefore I’d like to develop a better understanding of when to use or not use C++ lambdas.
My current understanding is that a lambda with no captured closure is exactly like a C callback. However, when the environment is captured either by value or by reference, an anonymous object is created on the stack. When a value-closure must be returned from a function, one wraps it in std::function. What happens to the closure memory in this case? Is it copied from the stack to the heap? Is it freed whenever the std::function is freed, i.e., is it reference-counted like a std::shared_ptr?
I imagine that in a real-time system I could set up a chain of lambda functions, passing B as a continuation argument to A, so that a processing pipeline A->B is created. In this case, the A and B closures would be allocated once, although I’m not sure whether they would be allocated on the stack or the heap. In general, however, this seems safe to use in a real-time system. On the other hand, if B constructs some lambda function C, which it returns, then the memory for C would be allocated and deallocated repeatedly, which would not be acceptable for real-time usage.
In pseudo-code, here is a DSP loop which I think is going to be real-time safe. I want to perform processing block A and then B, where A calls its argument. Both these functions return std::function objects, so f will be a std::function object whose environment is stored on the heap:
auto f = A(B); // A returns a function which calls B
// Memory for the function returned by A is on the heap?
// Note that A and B may maintain a state
// via mutable value-closure!
for (t=0; t<1000; t++) {
y = f(t);
}
And one which I think might be bad to use in real-time code:
for (t=0; t<1000; t++) {
y = A(B)(t);
}
And one where I think stack memory is likely used for the closure:
freq = 220;
A = 2;
for (t=0; t<1000; t++) {
y = [=](int t){ return sin(t*freq)*A; }(t); // closure is constructed, then called
}
In the latter case the closure is constructed at each iteration of the loop, but unlike the previous example it is cheap because it is just like a function call; no heap allocations are made. Moreover, I wonder if a compiler could “lift” the closure and make inlining optimisations.
Is this correct? Thank you.
Regarding “a lambda with no captured closure is exactly like a C callback”: no; it is always a C++ object of an unnamed type, created on the stack. A capture-less lambda can be converted into a function pointer (though whether it is suitable for C calling conventions is implementation dependent), but that doesn’t mean it is a function pointer.
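As a minimal sketch of that distinction (the names callback_t, call_twice and demo_captureless are hypothetical, chosen only for illustration), the closure below is a class object, and it becomes a function pointer only through an implicit conversion:

```cpp
#include <type_traits>

// Hypothetical C-style callback type, used only for illustration.
typedef int (*callback_t)(int);

// Hypothetical helper that takes a plain function pointer, as a C API might.
int call_twice(callback_t cb, int x) { return cb(cb(x)); }

int demo_captureless() {
    // The closure itself: an object of an unnamed class type.
    auto add_one = [](int x) { return x + 1; };

    // The closure type is a class type, not a pointer type.
    static_assert(std::is_class<decltype(add_one)>::value,
                  "a closure is a class object");
    static_assert(!std::is_pointer<decltype(add_one)>::value,
                  "a closure is not a function pointer");

    // The implicit conversion to a function pointer happens here.
    callback_t cb = add_one;
    return call_twice(cb, 40);
}
```

Calling demo_captureless() applies add_one twice to 40 through the converted pointer.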
A lambda isn’t anything special in C++11. It’s an object like any other object. A lambda expression results in a temporary, which can be used to initialize a variable on the stack (call it lamb).
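A small sketch of such a variable follows. The function name demo_stack_closure and the captured variable value are assumptions made for illustration; only the name lamb comes from the surrounding text.

```cpp
int demo_stack_closure() {
    int value = 5;

    // lamb is an ordinary local object. Its unnamed type stores a copy of
    // value as a member, because value is captured by value.
    auto lamb = [value]() { return value; };

    // Changing the original does not touch the copy held inside lamb.
    value = 10;

    // Closures are copyable like other objects; lamb_copy gets its own member.
    auto lamb_copy = lamb;

    return lamb() + lamb_copy();
}
```

Both calls read the captured copy (5), not the later value of the local variable, so the function returns 10.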
lamb is a stack object. It has a constructor and destructor. And it will follow all of the C++ rules for that. The type of lamb will contain the values/references that are captured; they will be members of that object, just like any other object members of any other type.
You can give it to a std::function, for example one named func_lamb. In this case, it will get a copy of the value of lamb. If lamb had captured anything by value, there would be two copies of those values; one in lamb, and one in func_lamb.
When the current scope ends, func_lamb will be destroyed, followed by lamb, as per the rules of cleaning up stack variables.
You could just as easily allocate one on the heap with new.
Exactly where the memory for the contents of a std::function goes is implementation-dependent, but the type-erasure employed by std::function generally requires at least one memory allocation (implementations are encouraged to avoid it for small callables such as plain function pointers). This is why, in C++11, std::function’s constructor can take an allocator (those overloads were removed in C++17).
std::function stores a copy of its contents. Like virtually every standard library C++ type, function uses value semantics. Thus, it is copyable; when it is copied, the new function object is completely separate. It is also moveable, so any internal allocations can be transferred appropriately without needing more allocating and copying.
Thus there is no need for reference counting.
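The value semantics described above can be checked with a short sketch. The struct CopyDemoResult, the function demo_function_copies and the counter start are hypothetical names; the mutable counter is only there to make each copy's separate state observable.

```cpp
#include <functional>
#include <memory>

// Hypothetical result record, used only to report what each object returned.
struct CopyDemoResult {
    int func_second;  // second call on func_lamb
    int copy_next;    // first call on a copy of func_lamb
    int func_next;    // next call on func_lamb itself
    int lamb_first;   // first call on the original closure
    int heap_first;   // first call on a heap-allocated std::function
};

CopyDemoResult demo_function_copies() {
    CopyDemoResult r;
    int start = 0;
    auto lamb = [start]() mutable { return ++start; };

    // func_lamb holds its own copy of lamb, including the captured counter.
    std::function<int()> func_lamb = lamb;
    func_lamb();
    r.func_second = func_lamb();

    // Copying the std::function copies the stored closure and its state.
    std::function<int()> func_copy = func_lamb;
    r.copy_next = func_copy();
    r.func_next = func_lamb();  // unaffected by the call on func_copy

    // The original closure never saw any of the calls above.
    r.lamb_first = lamb();

    // A std::function can also live on the heap; it copies lamb as it is now.
    std::unique_ptr<std::function<int()>> heap_func(
        new std::function<int()>(lamb));
    r.heap_first = (*heap_func)();
    return r;
}
```

Each object advances its own counter, which is what one would expect from independent copies rather than shared, reference-counted state.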
Everything else you state is correct, assuming that “memory allocation” equates to “bad to use in real-time code”.
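As a rough sketch of the first loop from the question, the pipeline below is built once and then only invoked inside the loop. The names Stage, make_A, make_B and run_pipeline_once_built are hypothetical, and the arithmetic in each stage is a placeholder for real processing. Whether constructing f allocates at all depends on the standard library implementation; the sketch only shows that construction is kept out of the loop.

```cpp
#include <functional>

// Hypothetical alias for one processing stage.
typedef std::function<int(int)> Stage;

// Hypothetical stage B: placeholder processing.
Stage make_B() {
    return [](int t) { return 2 * t; };
}

// Hypothetical stage A: returns a function that calls its continuation.
Stage make_A(Stage next) {
    return [next](int t) { return next(t) + 1; };
}

long run_pipeline_once_built() {
    // Any allocation needed for type erasure happens here, before the loop.
    Stage f = make_A(make_B());

    long sum = 0;
    for (int t = 0; t < 1000; ++t) {
        // Only calls f; no new std::function is constructed per iteration.
        sum += f(t);
    }
    return sum;
}
```

Writing make_A(make_B())(t) inside the loop instead would construct and destroy the std::function objects on every iteration, which is the pattern the question flags as a concern for real-time code.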