Consider the following code which uses non-blocking semantics to pop a stack:
    T Stack<T>::pop()
    {
        while (1) {
            if (top == NULL)
                throw std::string("Cannot pop from empty stack");
            Node* result = top;
            if (top && __sync_bool_compare_and_swap(&top, result, top->next)) {
                return result->data;
            }
        }
    }
My concern is this: if the thread doing the pop gets descheduled just before the second if statement, and the stack is empty by the time it gets its time slice back, is the `top &&` check in the second if statement good enough to avert a crash? Of course, in the worst case the thread could get descheduled just after comparing `top` with zero.
Any opinions appreciated. I am aware of the ABA problem that may also occur.
Firstly, assuming `top` is volatile and can be changed by another thread at any point, only take its value once per loop so you won’t get the rug pulled out from under you: copy it into `result`, test `result` against NULL, and pass `result->next` rather than `top->next` to the CAS.

This still doesn’t solve the problem of `result` having been deleted or otherwise modified between getting `top`’s value and dereferencing it.

You want to use a safe sentinel instead of `result->next`, so the logic is: CAS `top` from `result` to the sentinel first, and only once that succeeds read `result->next` and store it in `top`. Whether this still counts as wait-free† depends on whether you can find something useful to do in the intermediate state.

There are plenty of papers to read for more efficient ways than using a sentinel; in effect you’re simulating a two-word CAS with a single CAS, since you need to check something about the state of `result` as well as the state of `top`. These are much too complicated to reproduce here.

Not tested in any way:
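A minimal sketch of that sentinel logic, assuming C++11 `std::atomic` (whose `compare_exchange_strong` stands in for `__sync_bool_compare_and_swap`). The names `SentinelStack`, `busy_storage` and `busy()` are hypothetical, `push` is included only because it has to honour the sentinel too, and a `std::runtime_error` replaces the thrown `std::string`:

```cpp
#include <atomic>
#include <stdexcept>

// Hypothetical sketch: pop() claims the top node by swapping in a sentinel
// before it ever dereferences that node.
template <typename T>
class SentinelStack {
    struct Node {
        T data;
        Node* next;
    };

    std::atomic<Node*> top{nullptr};

    // Raw storage whose address means "a pop is in progress".
    // It is only ever compared against, never dereferenced.
    alignas(Node) unsigned char busy_storage[sizeof(Node)];
    Node* busy() { return reinterpret_cast<Node*>(busy_storage); }

public:
    ~SentinelStack() {
        // Assumes no other thread is still using the stack.
        Node* node = top.load();
        while (node != nullptr) {
            Node* next = node->next;
            delete node;
            node = next;
        }
    }

    void push(const T& value) {
        Node* node = new Node{value, nullptr};
        for (;;) {
            Node* current = top.load();        // read top once per iteration
            if (current == busy()) continue;   // a pop is mid-flight: spin
            node->next = current;
            if (top.compare_exchange_weak(current, node)) return;
        }
    }

    T pop() {
        for (;;) {
            Node* result = top.load();         // read top once per iteration
            if (result == nullptr)
                throw std::runtime_error("Cannot pop from empty stack");
            if (result == busy()) continue;    // another pop owns the top: spin
            // Claim the node; on success no other thread can reach *result.
            if (top.compare_exchange_strong(result, busy())) {
                top.store(result->next);       // leave the intermediate state
                T value = result->data;
                delete result;
                return value;
            }
        }
    }
};
```

Because `result->next` is read only after the sentinel is in place, a node that was freed and reallocated at the same address is simply popped as the new top, so the ABA case is harmless in this sketch. The cost is that a thread descheduled while holding the sentinel stalls every other push and pop.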
As you only inspect or mutate the pointee of `result` in one thread at a time, it should be safe (I haven’t used this exact pattern before, and normally I end up thinking of weird cases a couple of days after I design something). Whether this ends up being any better than wrapping a `std::deque` with `pthread_mutex_trylock` is worth measuring.
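For that measurement, the baseline could be as small as the sketch below. It assumes POSIX threads; `LockedStack` and `try_pop` are hypothetical names, and `try_pop` gives up immediately rather than waiting when the mutex is held:

```cpp
#include <deque>
#include <pthread.h>

// Hypothetical baseline: a std::deque guarded by a pthread mutex.
template <typename T>
class LockedStack {
    std::deque<T> items;
    pthread_mutex_t mutex;

public:
    LockedStack() { pthread_mutex_init(&mutex, nullptr); }
    ~LockedStack() { pthread_mutex_destroy(&mutex); }
    LockedStack(const LockedStack&) = delete;
    LockedStack& operator=(const LockedStack&) = delete;

    void push(const T& value) {
        pthread_mutex_lock(&mutex);
        items.push_back(value);
        pthread_mutex_unlock(&mutex);
    }

    // Returns false without waiting if the lock is held or the stack is empty.
    bool try_pop(T& out) {
        if (pthread_mutex_trylock(&mutex) != 0) return false;
        bool ok = !items.empty();
        if (ok) {
            out = items.back();
            items.pop_back();
        }
        pthread_mutex_unlock(&mutex);
        return ok;
    }
};
```

A caller that wants the original behaviour can loop on `try_pop` until it succeeds; one that has other work to do can go and do it instead.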
Of course, neither this nor the original is wait-free anyway: if one thread keeps pulling off the stack, any other thread can go round the loop indefinitely waiting for its CAS to succeed. That is fairly unlikely, and easily removed by returning false if the CAS does fail, but you do have to work out what you want the thread to do if it shouldn’t wait. If spinning until something can be dequeued is OK, you don’t need the return code.
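A sketch of the single-attempt variant, assuming C++11 `std::atomic` rather than the GCC builtin and a sentinel address to mark a pop in progress; `TryStack`, `busy()` and `try_pop` are hypothetical names. `try_pop` returns false when the stack is empty, when another pop is in flight, or when the CAS loses a race, and leaves the retry policy to the caller:

```cpp
#include <atomic>

// Hypothetical sketch: try_pop() makes one attempt and never spins.
template <typename T>
class TryStack {
    struct Node {
        T data;
        Node* next;
    };

    std::atomic<Node*> top{nullptr};

    // Address used as the "pop in progress" marker; never dereferenced.
    alignas(Node) unsigned char busy_storage[sizeof(Node)];
    Node* busy() { return reinterpret_cast<Node*>(busy_storage); }

public:
    ~TryStack() {
        // Assumes no other thread is still using the stack.
        Node* node = top.load();
        while (node != nullptr) {
            Node* next = node->next;
            delete node;
            node = next;
        }
    }

    void push(const T& value) {
        Node* node = new Node{value, nullptr};
        for (;;) {
            Node* current = top.load();
            if (current == busy()) continue;   // wait out an in-flight pop
            node->next = current;
            if (top.compare_exchange_weak(current, node)) return;
        }
    }

    bool try_pop(T& out) {
        Node* result = top.load();             // single read of top
        if (result == nullptr || result == busy()) return false;
        if (!top.compare_exchange_strong(result, busy())) return false;
        top.store(result->next);               // this thread now owns *result
        out = result->data;
        delete result;
        return true;
    }
};
```

A bool cannot tell "empty" from "contended"; if the caller needs to treat those differently, a small result enum would be the obvious extension.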
† I mostly work on x86/x64, where there’s no such thing as lock free code, as CMPXCHG implicitly locks the bus and takes time proportional to the number of caches to sync. So you can have code which doesn’t spin and wait, but you can’t have code which doesn’t lock.