I must be just having a moment, because this should be easy but I can’t seem to get it working right.
Whats the correct way to implement an atomic counter in GCC?
i.e. I want a counter that runs from zero to 4 and is thread safe.
I was doing this (which is further wrapped in a class, but not here)
static volatile int _count = 0;
const int limit = 4;
int get_count(){
// Create a local copy of diskid
int save_count = __sync_fetch_and_add(&_count, 1);
if (save_count >= limit){
__sync_fetch_and_and(&_count, 0); // Set it back to zero
}
return save_count;
}
But it’s running from 1 through from 1 – 4 inclusive then around to zero.
It should go from 0 – 3. Normally I’d do a counter with a mod operator but I don’t
know how to do that safely.
Perhaps this version is better. Can you see any problems with it, or offer
a better solution.
int get_count(){
// Create a local copy of diskid
int save_count = _count;
if (save_count >= limit){
__sync_fetch_and_and(&_count, 0); // Set it back to zero
return 0;
}
return save_count;
}
Actually, I should point out that it’s not absolutely critical that each thread get a different value. If two threads happened to read the same value at the same time that wouldn’t be a problem. But they can’t exceed limit at any time.
Your code isn’t atomic (and your second
get_countdoesn’t even increment the counter value)!Say
countis 3 at the start and two threads simultaneously callget_count. One of them will get his atomic add done first and incrementscountto 4. If the second thread is fast enough, it can increment it to5before the first thread resets it to zero.Also, in your wraparound processing, you reset
countto 0 but notsave_count. This is clearly not what’s intended.This is easiest if
limitis a power of 2. Don’t ever do the reduction yourself, just useor alternatively
This only does one atomic operation per invocation, is safe and very cheap. For generic limits, you can still use
%, but that will break the sequence if the counter ever overflows. You can try using a 64-bit value (if your platform supports 64-bit atomics) and just hope it never overflows; this is a bad idea though. The proper way to do this is using an atomic compare-exchange operation. You do this:This approach generalizes to more complicated sequences and update operations too.
That said, this type of lockless operation is tricky to get right, relies on undefined behavior to some degree (all current compilers get this right, but no C/C++ standard before C++0x actually has a well-defined memory model) and is easy to break. I recommend using a simple mutex/lock unless you’ve profiled it and found it to be a bottleneck.