In a high-performance computing context, I saw code like the following:
    typedef union
    {
        erts_smp_rwmtx_t rwmtx;
        byte cache_line_align_[ERTS_ALC_CACHE_LINE_ALIGN_SIZE(sizeof(erts_smp_rwmtx_t))];
    } erts_meta_main_tab_lock_t;

    erts_meta_main_tab_lock_t main_tab_lock[16];
What does the cache_line_align_ appearing above do? Why is it useful?
In a multi-threaded program on a shared-memory system, two threads may access objects or primitives that sit right next to each other in memory.
Unfortunately, caches move memory around in units of whole cache lines. If both pieces are on the same cache line and at least one thread writes to its piece, the cache-coherence hardware has to pass the line back and forth between the processors, even though the threads never touch the same bytes.
This pitfall is known as False Sharing.
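To overcome this, we can add a buffer of space between the memory locations (variables) in question.
That buffer does absolutely nothing except spread our desired variables apart in memory. In the code above, cache_line_align_ is never read or written. It only forces the union to be at least as large as the byte array, which (judging by the macro name) is sizeof(erts_smp_rwmtx_t) rounded up to a multiple of the cache-line size, so each element of main_tab_lock occupies its own cache line(s).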
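To make the padding trick concrete, here is a minimal C11 sketch of the same pattern. It is not the Erlang runtime's actual code: `demo_lock_t`, `padded_lock_t`, `CACHE_LINE_ALIGN_SIZE`, the helper functions and the 64-byte line size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines (common on x86-64; check your target). */
#define CACHE_LINE_SIZE 64

/* Round sz up to the next multiple of the cache-line size. */
#define CACHE_LINE_ALIGN_SIZE(sz) \
    ((((sz) + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE) * CACHE_LINE_SIZE)

/* Hypothetical stand-in for a lock type such as erts_smp_rwmtx_t. */
typedef struct {
    int state;
    int waiters;
} demo_lock_t;

/* A union is as large as its largest member, so the unused byte array
 * pads every element out to a whole number of cache lines. */
typedef union {
    demo_lock_t lock;
    unsigned char cache_line_align_[CACHE_LINE_ALIGN_SIZE(sizeof(demo_lock_t))];
} padded_lock_t;

static_assert(sizeof(padded_lock_t) % CACHE_LINE_SIZE == 0,
              "each element must span whole cache lines");

/* Padding alone only fixes the spacing; aligning the array start to a
 * line boundary makes every element begin on its own line. */
static alignas(CACHE_LINE_SIZE) padded_lock_t locks[16];

/* Distance in bytes between two adjacent array elements. */
static size_t lock_stride(void)
{
    return (size_t)((char *)&locks[1] - (char *)&locks[0]);
}

/* Returns 1 if element i starts exactly on a cache-line boundary. */
static int lock_is_line_aligned(size_t i)
{
    return ((uintptr_t)&locks[i]) % CACHE_LINE_SIZE == 0;
}
```

With this layout, a thread hammering `locks[0].lock` and another hammering `locks[1].lock` never write to the same cache line, at the cost of some wasted memory per element.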
You may see similar-looking tricks in single-threaded programs that have been optimized for cache performance. In that case, though, the goal is usually the opposite: you want the variables you access most often as close together as possible, so that they all fit on the same cache line(s).
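For contrast, the single-threaded "keep hot data together" layout might look like the following sketch. The struct, its field names, the helper function and the 64-byte line size are hypothetical and chosen only to illustrate the idea.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64

/* Hypothetical ring-buffer bookkeeping. The fields touched on every
 * operation come first and start on a cache-line boundary, so reading
 * one of them pulls the others into the cache as well. */
typedef struct {
    /* Hot: read or written on every push/pop. */
    alignas(CACHE_LINE_SIZE) uint32_t head;
    uint32_t tail;
    uint32_t count;
    uint32_t mask;

    /* Cold: only used for diagnostics. */
    char     name[128];
    uint64_t created_at;
} ring_stats_t;

/* The hot fields end where the first cold field begins. */
static_assert(offsetof(ring_stats_t, name) <= CACHE_LINE_SIZE,
              "hot fields should fit in the first cache line");

/* Number of bytes occupied by the hot fields. */
static size_t hot_bytes(void)
{
    return offsetof(ring_stats_t, name);
}
```

Here the padding and alignment serve locality rather than separation: the four hot counters occupy 16 bytes at the start of one line, and the rarely used fields are pushed behind them.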