GCC offers a nice set of built-in functions for atomic operations. And being on

Question

Asked: May 24, 2026 (2026-05-24T22:53:37+00:00)

GCC offers a nice set of built-in functions for atomic operations. And being on MacOS or iOS, even Apple offers a nice set of atomic functions. However, all these functions perform an operation, e.g. an addition/subtraction, a logical operation (AND/OR/XOR) or a compare-and-set/compare-and-swap. What I am looking for is a way to atomically assign/read an int value, like:

int a;
/* ... */    
a = someVariable;

That’s all. a will be read by another thread, and it is only important that a has either its old value or its new value. Unfortunately, the C standard does not guarantee that assigning or reading a value is an atomic operation. I remember that I once read somewhere that writing or reading a variable of type int is guaranteed to be atomic in GCC (regardless of the size of int), but I searched everywhere on the GCC homepage and cannot find this statement any longer (maybe it was removed).

I cannot use sig_atomic_t because sig_atomic_t has no guaranteed size and it might also have a different size than int.

Since only one thread will ever “write” a value to a, while both threads will “read” the current value of a, I don’t need to perform the operations themselves in an atomic manner, e.g.:

/* thread 1 */
someVariable = atomicRead(a);
/* Do something with someVariable, non-atomic, when done */
atomicWrite(a, someVariable);

/* thread 2 */
someVariable = atomicRead(a);
/* Do something with someVariable, but never write to a */

If both threads were going to write to a, then all operations would have to be atomic. With a single writer, however, fully atomic read-modify-write operations may only waste CPU time, and we are extremely low on CPU resources in our project. So far we use a mutex around read/write operations of a, and even though the mutex is held for such a tiny amount of time, this already causes problems (one of the threads is a realtime thread, and blocking on a mutex causes it to miss its realtime constraints, which is pretty bad).

Of course I could use __sync_fetch_and_add to read the variable (simply adding “0” so that its value is not modified) and __sync_val_compare_and_swap to write it (I know its old value, so passing that in makes sure the value is always exchanged), but won’t this add unnecessary overhead?

1 Answer

Editorial Team · Answer 1 · 2026-05-24T22:53:37+00:00

A __sync_fetch_and_add with a 0 argument is indeed the best bet if you want your load to be atomic and to act as a memory barrier. Similarly, you can use __sync_fetch_and_and with 0 or __sync_fetch_and_or with -1 to store 0 or -1 atomically with a memory barrier. For writing an arbitrary value, you can use __sync_lock_test_and_set (actually an xchg operation) if an “acquire” barrier is enough, or, if using Clang, __sync_swap (an xchg operation with a full barrier).
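As a minimal sketch of the builtins named above, the wrappers below show how an int could be read and written this way. The function names (atomic_read_int and so on) are hypothetical and chosen only for illustration; the __sync builtins themselves are the GCC ones.

```c
/* Atomic load with a full barrier: adding 0 leaves the value unchanged
   and returns what was stored at *p. */
static int atomic_read_int(int *p)
{
    return __sync_fetch_and_add(p, 0);
}

/* Atomic store implemented as an exchange. GCC documents this builtin
   as an acquire barrier only, not a full barrier.
   Returns the value that was previously stored at *p. */
static int atomic_write_int(int *p, int value)
{
    return __sync_lock_test_and_set(p, value);
}

/* Store 0 atomically with a full barrier (x & 0 == 0). */
static void atomic_store_zero(int *p)
{
    __sync_fetch_and_and(p, 0);
}

/* Store -1 atomically with a full barrier (x | -1 == -1). */
static void atomic_store_minus_one(int *p)
{
    __sync_fetch_and_or(p, -1);
}
```

Each of these is a full read-modify-write on the memory location, so it typically costs more than a plain load or store.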

However, in many cases that’s overkill and you may prefer to add memory barriers manually. If you do not want the memory barrier, you can use a volatile access to atomically read or write a variable that is aligned and no wider than a machine word:

#define __sync_access(x) (*(volatile __typeof__(x) *) &(x))

(This macro is an lvalue, so you can also use it for a store, like __sync_access(x) = 0.) The macro implements the same semantics as the C++11 memory_order_consume form, but only under two assumptions:

that your machine has coherent caches; if not, you need a memory barrier or global cache flush before the load (or before the first of a group of loads).
that your machine is not a DEC Alpha. The Alpha had very relaxed semantics for reordering memory accesses, so on it you’d need a memory barrier after the load (and after each load in a group of loads). On the Alpha the above macro only provides memory_order_relaxed semantics. BTW, the first versions of the Alpha couldn’t even store a byte atomically (only aligned 4-byte and 8-byte quantities).

In either case, __sync_fetch_and_add would still work. As far as I know, no other machine imitated the Alpha, so neither assumption should pose problems on current computers.
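To illustrate the "add memory barriers manually" option, here is a rough sketch of a single writer publishing a value to a reader. It reuses the macro's expansion under a hypothetical name (VOLATILE_ACCESS), and the variables payload and ready as well as the functions publish and try_consume are made up for the example. It relies on the assumptions listed above (aligned, word-sized accesses and coherent caches); the C standard itself does not guarantee this behavior.

```c
/* Same expansion as the macro above, under a hypothetical name. */
#define VOLATILE_ACCESS(x) (*(volatile __typeof__(x) *) &(x))

static int payload;  /* hypothetical data written by the single writer */
static int ready;    /* hypothetical flag: nonzero once payload is valid */

/* Writer thread: store the data, then raise the flag. */
static void publish(int value)
{
    payload = value;
    __sync_synchronize();        /* full barrier: payload store is ordered
                                    before the flag store */
    VOLATILE_ACCESS(ready) = 1;  /* plain aligned word-sized store */
}

/* Reader thread: returns 1 and copies the payload into *out if it has
   been published, otherwise returns 0 and leaves *out untouched. */
static int try_consume(int *out)
{
    if (!VOLATILE_ACCESS(ready)) /* plain aligned word-sized load */
        return 0;
    __sync_synchronize();        /* full barrier: flag load is ordered
                                    before the payload load */
    *out = payload;
    return 1;
}
```

The reader never blocks, which matters for the realtime thread in the question: if the flag is not set yet, try_consume simply returns 0 and the caller can retry later.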
