I recently stumbled upon this Wikipedia article. From my experience with multi-threading I am aware of the multitude of issues caused by the program being able to switch between threads at any time. However, I never knew that compiler and hardware optimisations could reorder operations in a way that is guaranteed to work for a single thread, but not necessarily for multiple threads. Can anyone explain how to correctly deal with the possibility of reordered operations in a multi-threaded environment?
UPDATE: I had originally linked to the Out-of-Order Execution article by accident instead of the Memory barrier article, which has a better explanation of the problem.
I will address your question as one about multithreading in a high-level language, rather than discussing CPU pipeline optimization.
Most, if not all, modern high-level multithreaded languages provide constructs for managing the compiler's freedom to reorder the logical execution of instructions. In C#, these include field-level constructs (the `volatile` modifier), block-level constructs (the `lock` keyword), and imperative constructs (`Thread.MemoryBarrier`).

Applying `volatile` to a field restricts how accesses to that field can be reordered relative to the surrounding instruction sequence (source code). In C#, a volatile read has acquire semantics, so later memory accesses cannot move ahead of it, and a volatile write has release semantics, so earlier memory accesses cannot move past it.

Using `lock` around a block of code ensures that only one thread at a time executes the block for a given lock object, and that the enclosed memory accesses are not moved outside the block.

The `Thread.MemoryBarrier` method inserts a full fence: neither the compiler nor the CPU may reorder memory accesses across this point in the instruction sequence. This enables a more advanced technique for specialized requirements.

The techniques above are described in order of increasing complexity and performance. As with all concurrent programming, determining when and where to apply these techniques is the challenge. When synchronizing access to a single field, the `volatile` keyword will work, but it could prove to be overkill. Sometimes you only need to synchronize writes (in which case a `ReaderWriterLockSlim` would accomplish the same thing with much better performance). Sometimes you need to manipulate the field multiple times in quick succession, or you must check a field and conditionally manipulate it. In these cases, the `lock` keyword is a better idea. Sometimes you have multiple threads manipulating shared state in a very loosely-synchronized model to improve performance (not typically recommended). In that case, carefully placed memory barriers can prevent stale and inconsistent data from being used in threads.
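To make the three constructs a little more concrete, here is a minimal C# sketch. The class `ReorderingDemo`, its fields, and its method names are hypothetical and invented for this illustration; only `volatile`, `lock`, and `Thread.MemoryBarrier` come from the discussion above. It assumes a simple "write a payload, then raise a flag" hand-off between two threads, plus a check-then-act counter, and is not meant as a definitive pattern.

```csharp
using System;
using System.Threading;

// Hypothetical demo type; all names here are assumptions made for illustration.
public static class ReorderingDemo
{
    // Field-level: volatile reads are acquire reads, volatile writes are release writes.
    private static volatile bool _ready;
    private static int _payload;

    // Imperative: plain fields ordered by explicit full fences instead.
    private static bool _plainReady;
    private static int _plainPayload;

    // Block-level: a private lock object guarding a check-then-act on _counter.
    private static readonly object _gate = new object();
    private static int _counter;

    public static int PublishWithVolatile()
    {
        _ready = false;
        _payload = 0;
        int observed = -1;
        var consumer = new Thread(() =>
        {
            // The acquire read of _ready keeps the later read of _payload
            // from being moved ahead of it.
            while (!_ready) { Thread.Yield(); }
            observed = _payload;
        });
        consumer.Start();
        _payload = 42;  // ordinary write
        _ready = true;  // release write: the payload write cannot move after it
        consumer.Join();
        return observed;
    }

    public static int PublishWithBarrier()
    {
        _plainReady = false;
        _plainPayload = 0;
        int observed = -1;
        var consumer = new Thread(() =>
        {
            while (true)
            {
                Thread.MemoryBarrier();  // forces a fresh read of the flag on each pass
                if (_plainReady) { break; }
                Thread.Yield();
            }
            Thread.MemoryBarrier();      // payload read cannot move ahead of the flag read
            observed = _plainPayload;
        });
        consumer.Start();
        _plainPayload = 7;
        Thread.MemoryBarrier();          // payload write cannot move after the flag write
        _plainReady = true;
        consumer.Join();
        return observed;
    }

    public static int IncrementUpTo(int limit, int threads, int attemptsPerThread)
    {
        lock (_gate) { _counter = 0; }
        var workers = new Thread[threads];
        for (int i = 0; i < threads; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int n = 0; n < attemptsPerThread; n++)
                {
                    // The check and the increment happen as one unit per thread.
                    lock (_gate)
                    {
                        if (_counter < limit) { _counter++; }
                    }
                }
            });
            workers[i].Start();
        }
        foreach (var w in workers) { w.Join(); }
        lock (_gate) { return _counter; }
    }
}
```

The `volatile` version is the least code but only covers the single flag; the barrier version gets the same ordering on plain fields at the cost of having to place every fence by hand; and the `lock` version is the one to reach for when a field is checked and then conditionally changed, since neither of the other two makes that pair of operations atomic.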