From O’Reilly’s C# in a Nutshell:
    class Foo
    {
        int _answer;
        bool _complete;

        void A()
        {
            _answer = 123;
            Thread.MemoryBarrier();    // Barrier 1
            _complete = true;
            Thread.MemoryBarrier();    // Barrier 2
        }

        void B()
        {
            Thread.MemoryBarrier();    // Barrier 3
            if (_complete)
            {
                Thread.MemoryBarrier();    // Barrier 4
                Console.WriteLine (_answer);
            }
        }
    }
Suppose methods A and B ran concurrently on different threads:
The author says: “Barriers 1 and 4 prevent this example from writing “0”. Barriers 2 and 3 provide a
freshness guarantee: they ensure that if B ran after A, reading _complete would evaluate
to true.”
My questions are:
- Why is Barrier 4 needed? Isn’t Barrier 1 enough?
- Why are Barriers 2 and 3 needed?
- From what I understand, a barrier prevents the instructions that precede it from being executed after the instructions that follow it. Am I correct?
A memory barrier enforces an ordering constraint on reads from and writes to memory: memory access operations before the barrier happen-before the memory access operations after the barrier.

Barriers 1 and 4 have complementary roles: barrier 1 ensures that the write to _answer happens-before the write to _complete, while barrier 4 ensures that the read from _complete happens-before the read from _answer. Imagine barrier 4 isn’t there, but barrier 1 is. While it is guaranteed that 123 is written to _answer before true is written to _complete, some other thread running B() may still have its read operations reordered, and hence it may read _answer before it reads _complete. Similarly, if barrier 1 is removed with barrier 4 kept: while the read from _complete in B() will always happen-before the read from _answer, _complete could still be written to before _answer by some other thread running A().

Barriers 2 and 3 provide a freshness guarantee: if barrier 3 is executed after barrier 2, then the state visible to the thread running A() at the point when it executes barrier 2 becomes visible to the thread running B() at the point when it executes barrier 3. In the absence of either of these two barriers, B() executing after A() completed might not see the changes made by A(). In particular, barrier 2 prevents the value written to _complete from being cached by the processor running A() and forces the processor to write it out to main memory. Similarly, barrier 3 prevents the processor running B() from relying on its cache for the value of _complete, forcing a read from main memory. Note, however, that a stale cache isn’t the only thing that can break the freshness guarantee in the absence of memory barriers 2 and 3. Reordering of operations on the memory bus is another example of such a mechanism.

A memory barrier just ensures that the effects of memory access operations are ordered across the barrier. Other instructions (e.g. incrementing a value in a register) may still be reordered.
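
To try the pattern out, here is a minimal runnable sketch of the class from the question with all four barriers kept in place. It is an illustration under some assumptions rather than the book's code: the methods are made public, B() is replaced by a hypothetical TryB() that returns the value instead of printing it, and the Program class with its polling loop is invented for the demo.

```csharp
using System;
using System.Threading;

class Foo
{
    int _answer;
    bool _complete;

    public void A()
    {
        _answer = 123;
        Thread.MemoryBarrier();    // Barrier 1: _answer is written before _complete
        _complete = true;
        Thread.MemoryBarrier();    // Barrier 2: freshness of the write to _complete
    }

    // Hypothetical variant of B(): returns the value instead of printing it.
    public bool TryB(out int answer)
    {
        answer = 0;
        Thread.MemoryBarrier();    // Barrier 3: freshness of the read of _complete
        if (!_complete)
            return false;
        Thread.MemoryBarrier();    // Barrier 4: _complete is read before _answer
        answer = _answer;
        return true;
    }
}

static class Program
{
    static void Main()
    {
        var foo = new Foo();
        var writer = new Thread(foo.A);
        writer.Start();

        // Poll until the writer's flag becomes visible to this thread.
        int answer;
        var spin = new SpinWait();
        while (!foo.TryB(out answer))
            spin.SpinOnce();

        writer.Join();
        Console.WriteLine(answer); // prints 123
    }
}
```

Because TryB() only reads _answer after it has seen _complete as true, and barriers 1 and 4 keep both pairs of accesses in order, the loop can only exit with 123. Removing a barrier will not necessarily make the program fail on a given machine: the reorderings described above are permitted, not guaranteed, so a passing run proves nothing about the barrier-free version.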