GCC offers a nice set of built-in functions for atomic operations, and on macOS or iOS Apple offers its own set of atomic functions as well. However, all these functions perform an operation, e.g. an addition/subtraction, a logical operation (AND/OR/XOR) or a compare-and-set/compare-and-swap. What I am looking for is a way to atomically assign/read an int value, like:
int a;
/* ... */
a = someVariable;
That’s all. a will be read by another thread and it is only important that a either has its old value or its new value. Unfortunately the C standard does not guarantee that assigning or reading a value is an atomic operation. I remember once reading somewhere that writing or reading a variable of type int is guaranteed to be atomic in GCC (regardless of the size of int), but I searched everywhere on the GCC homepage and I cannot find this statement any longer (maybe it was removed).
I cannot use sig_atomic_t because sig_atomic_t has no guaranteed size and it might also have a different size than int.
Since only one thread will ever “write” a value to a, while both threads will “read” the current value of a, I don’t need to perform the operations themselves in an atomic manner, e.g.:
/* thread 1 */
someVariable = atomicRead(a);
/* Do something with someVariable, non-atomic, when done */
atomicWrite(a, someVariable);
/* thread 2 */
someVariable = atomicRead(a);
/* Do something with someVariable, but never write to a */
If both threads were going to write to a, then all operations would have to be atomic, but with only one writer that would just waste CPU time, and we are extremely low on CPU resources in our project. So far we use a mutex around read/write operations of a and even though the mutex is held for such a tiny amount of time, this already causes problems (one of the threads is a realtime thread and blocking on a mutex causes it to fail its realtime constraints, which is pretty bad).
Of course I could use a __sync_fetch_and_add to read the variable (and simply add “0” to it, to not modify its value) and use a __sync_val_compare_and_swap for writing it (as I know its old value, so passing that in will make sure the value is always exchanged), but won’t this add unnecessary overhead?
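For illustration, the approach described above might be sketched roughly like this, assuming GCC's legacy __sync builtins. The names atomic_read_int and atomic_write_int are hypothetical and only stand in for the atomicRead/atomicWrite placeholders used earlier:

```c
/* Sketch only: atomic_read_int and atomic_write_int are hypothetical names. */

/* Atomic load with a full barrier: adding 0 returns the current value
   without changing it. */
static inline int atomic_read_int(int *p)
{
    return __sync_fetch_and_add(p, 0);
}

/* Atomic store with a full barrier. The compare-and-swap is retried until
   the value we expected to replace is the one that was actually there.
   With a single writer the first attempt normally succeeds. */
static inline void atomic_write_int(int *p, int new_value)
{
    int expected = atomic_read_int(p);
    for (;;) {
        int seen = __sync_val_compare_and_swap(p, expected, new_value);
        if (seen == expected)
            break;          /* swap happened */
        expected = seen;    /* someone else changed it; try again */
    }
}
```

Both helpers execute a locked read-modify-write instruction on every call, which is the overhead the question is worried about.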
A
__sync_fetch_and_add with a 0 argument is indeed the best bet if you want your load to be atomic and act as a memory barrier. Similarly, you can use an and with 0 or an or with -1 to store 0 and -1 atomically with a memory barrier. For writing, you can use __sync_test_and_set (actually an xchg operation) if an “acquire” barrier is enough, or if using Clang you can use __sync_swap (which is an xchg operation with a full barrier).
However, in many cases that’s overkill and you may prefer to add memory barriers manually. If you do not want the memory barrier, you can use a volatile load to atomically read/write a variable that is aligned and no wider than a word:
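A minimal sketch of such a volatile-access macro, assuming GCC or Clang (it relies on the __typeof__ extension). The name VOLATILE_ACCESS and the helpers publish and snapshot are hypothetical and not taken from any library:

```c
/* Hypothetical name; relies on the GCC/Clang __typeof__ extension.
   Expands to an lvalue that forces exactly one volatile access to x. */
#define VOLATILE_ACCESS(x) (*(volatile __typeof__(x) *)&(x))

/* Shared between threads; naturally aligned and no wider than a word. */
static int shared_value;

/* Writer side: a single plain store, no memory barrier implied. */
static void publish(int v)
{
    VOLATILE_ACCESS(shared_value) = v;
}

/* Reader side: a single plain load, no memory barrier implied. */
static int snapshot(void)
{
    return VOLATILE_ACCESS(shared_value);
}
```

The volatile qualifier only stops the compiler from caching, splitting or eliding the access. Whether the resulting load or store is indivisible still depends on the alignment and width conditions stated above.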
(This macro is an lvalue, so you can also use it for a store like __sync_store(x) = 0). The macro implements the same semantics as the C++11 memory_order_consume form, but only under two assumptions:
that your machine has coherent caches; if not, you need a memory barrier or global cache flush before the load (or before the first of a group of loads).
that your machine is not a DEC Alpha. The Alpha had very relaxed semantics for reordering memory accesses, so on it you’d need a memory barrier after the load (and after each load in a group of loads). On the Alpha the above macro only provides memory_order_relaxed semantics. BTW, the first versions of the Alpha couldn’t even store a byte atomically (only a word, which was 8 bytes).
In either case, the __sync_fetch_and_add would work. As far as I know, no other machine imitated the Alpha so neither assumption should pose problems on current computers.
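If you do go the route of adding barriers by hand, one possible shape is sketched below, assuming GCC's __sync_synchronize() as the full barrier. The macro VOLATILE_ACCESS, the variables payload and ready, and both function names are hypothetical and chosen only for this illustration:

```c
/* Hypothetical volatile-access macro (GCC/Clang __typeof__ extension). */
#define VOLATILE_ACCESS(x) (*(volatile __typeof__(x) *)&(x))

static int payload;   /* data written before it is published */
static int ready;     /* set to 1 once payload is valid */

/* Writer: store the data, then a full barrier, then raise the flag,
   so a reader that sees ready == 1 also sees the new payload. */
static void producer_publish(int value)
{
    VOLATILE_ACCESS(payload) = value;
    __sync_synchronize();
    VOLATILE_ACCESS(ready) = 1;
}

/* Reader: returns 1 and fills *out if a value has been published,
   otherwise returns 0 and leaves *out untouched. */
static int consumer_try_read(int *out)
{
    if (!VOLATILE_ACCESS(ready))
        return 0;
    __sync_synchronize();   /* order the flag load before the payload load */
    *out = VOLATILE_ACCESS(payload);
    return 1;
}
```

The barrier on the reader side is the one the Alpha would need. On machines that honor dependency or load ordering it may be stronger than necessary, so treat this as a conservative starting point rather than a tuned implementation.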