As far as I can tell, std::stack and other ‘handmade’ stacks are much slower than the application’s own stack.
Maybe there’s already a good low-level ‘bicycle’ (a ready-made stack implementation)?
Or is it a good idea to create a new thread and use its stack?
And how can I work directly with the application stack? (asm {} only?)
The only way in which std::stack is significantly slower than the processor stack is that it has to allocate memory from the free store. By default, it uses std::deque for storage, which allocates memory in chunks as needed. As long as you don’t keep destroying and recreating the stack, it will keep that memory and not need to allocate more unless it grows bigger than before. So structure your code so that one stack is created outside any loop and reused, rather than constructing a new stack on every iteration.
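A minimal sketch of that advice, not the original snippet. The function names and the push/pop workload are hypothetical, chosen only to contrast the two loop shapes. How much memory a std::deque-backed stack actually retains between passes depends on the standard library implementation, so treat the benefit as something to confirm with a profiler.

```cpp
#include <cassert>
#include <stack>

// Hypothetical workload: push n values, then pop them all while summing.
long process(std::stack<int>& s, int n) {
    assert(s.empty());  // each pass is expected to start from an empty stack
    for (int i = 0; i < n; ++i) {
        s.push(i);
    }
    long sum = 0;
    while (!s.empty()) {
        sum += s.top();
        s.pop();
    }
    return sum;
}

// Preferred shape: one stack outlives the loop, so storage it still holds
// can be reused by later passes.
long reuse_stack(int passes, int n) {
    std::stack<int> s;
    long total = 0;
    for (int p = 0; p < passes; ++p) {
        total += process(s, n);
    }
    return total;
}

// Shape to avoid: a fresh stack per pass allocates and frees its storage
// every time round the loop.
long recreate_stack(int passes, int n) {
    long total = 0;
    for (int p = 0; p < passes; ++p) {
        std::stack<int> s;
        total += process(s, n);
    }
    return total;
}
```

Both functions compute the same result; only the allocation pattern differs.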
If, after profiling, you find that it’s still spending too long allocating memory, then you could preallocate a large block so that only a single allocation is needed when your program starts up (assuming you can find an upper limit for the stack size). You need to get into the innards of the stack to do this, but it is possible by deriving your own stack type: derive from std::stack with std::vector as the underlying container, and call reserve on the protected container member c in the constructor (not tested).
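One way such a derived type might look, as a sketch rather than the original snippet. The class name PreallocatedStack and the capacity() accessor are hypothetical, and the accessor exists only so the reservation can be observed. It relies on std::stack exposing its underlying container as the protected member c.

```cpp
#include <cstddef>
#include <stack>
#include <vector>

// Hypothetical stack type that reserves its storage up front.
// Not meant to be used polymorphically through a std::stack pointer.
class PreallocatedStack : public std::stack<int, std::vector<int>> {
public:
    explicit PreallocatedStack(std::size_t capacity) {
        // `c` is the protected underlying container of std::stack.
        c.reserve(capacity);
    }

    // Illustration only: report how much storage is currently reserved.
    std::size_t capacity() const { return c.capacity(); }
};
```

As long as the stack never grows beyond the reserved size, pushes do not trigger further allocations, since std::vector only reallocates when its capacity is exceeded.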
EDIT: this is quite a gruesome hack, but it is supported by the C++ Standard. More tasteful would be to initialise a stack with a reserved vector, at the cost of an extra allocation. And don’t try to use this class polymorphically – STL containers aren’t designed for that.
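The more tasteful alternative could be sketched as follows. The helper name make_reserved_stack is hypothetical. This version assumes C++11 or later and moves the vector into the stack, which in practice hands over the reserved buffer without the extra allocation; a copied vector is not guaranteed to keep the reserved capacity.

```cpp
#include <cstddef>
#include <stack>
#include <utility>
#include <vector>

// Hypothetical helper: build a vector-backed stack whose storage
// has been reserved before the stack takes ownership of it.
std::stack<int, std::vector<int>> make_reserved_stack(std::size_t capacity) {
    std::vector<int> storage;
    storage.reserve(capacity);
    // Move the (empty but reserved) vector into the stack adaptor.
    return std::stack<int, std::vector<int>>(std::move(storage));
}
```

No derived class is needed here, so the protected member is left alone.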
Using the processor stack won’t be portable, and on some platforms might make it impossible to use local variables after pushing something – you might end up having to code everything in assembly. (That is an option, if you really need to count every last cycle and don’t need portability, but make sure you use a profiler to check that it really is worthwhile). There’s no way to use another thread’s stack that will be faster than a stack container.