Reading this question, I wanted to test if I could demonstrate the non-atomicity of reads and writes on a type for which the atomicity of such operations is not guaranteed.
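    using System;
    using System.Diagnostics;
    using System.Threading;

    static class Program
    {
        // Shared field: written by one thread, read by another, with no synchronization.
        private static double _d;

        [STAThread]
        static void Main()
        {
            new Thread(KeepMutating).Start();
            KeepReading();
        }

        private static void KeepReading()
        {
            while (true)
            {
                double dCopy = _d;
                // In release: if (...) throw ...
                Debug.Assert(dCopy == 0D || dCopy == double.MaxValue); // Never fails
            }
        }

        private static void KeepMutating()
        {
            Random rand = new Random();
            while (true)
            {
                // Alternate randomly between two bit patterns that differ in both 32-bit halves.
                _d = rand.Next(2) == 0 ? 0D : double.MaxValue;
            }
        }
    }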
To my surprise, the assertion refused to fail even after a full three minutes of execution.
What gives?
- The test is incorrect.
- The specific timing characteristics of the test make it unlikely/impossible that the assertion will fail.
- The probability is so low that I have to run the test for much longer to make it likely that it will trigger.
- The CLR provides stronger guarantees about atomicity than the C# spec.
- My OS/hardware provides stronger guarantees than the CLR.
- Something else?
Of course, I don’t intend to rely on any behaviour that is not explicitly guaranteed by the spec, but I would like a deeper understanding of the issue.
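For comparison, here is a minimal sketch of a read/write pair that does not depend on any of the possibilities listed above. It assumes the `Interlocked.Exchange(ref double, double)` and `Interlocked.CompareExchange(ref double, double, double)` overloads from `System.Threading`; the class name `AtomicDouble` and its `Read`/`Write` methods are hypothetical names chosen for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical helper: wraps a shared double so every access goes through Interlocked.
public static class AtomicDouble
{
    private static double _d;

    // Atomically replaces the stored value.
    public static void Write(double value)
    {
        Interlocked.Exchange(ref _d, value);
    }

    // Atomically reads the stored value. CompareExchange always returns the
    // original contents; swapping 0.0 for 0.0 leaves the field unchanged.
    public static double Read()
    {
        return Interlocked.CompareExchange(ref _d, 0D, 0D);
    }

    public static void Main()
    {
        Write(double.MaxValue);
        Console.WriteLine(Read() == double.MaxValue);
    }
}
```

The trade-off is cost: each access becomes an interlocked operation, which is noticeably slower than a plain load or store.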
FYI, I ran this under both the Debug and Release profiles (replacing Debug.Assert with if (...) throw for Release) in two separate environments:
- Windows 7 64-bit + .NET 3.5 SP1
- Windows XP 32-bit + .NET 2.0
EDIT: To exclude the possibility of John Kugelman’s comment “the debugger is not Schrodinger-safe” being the problem, I added the line someList.Add(dCopy); to the KeepReading method and verified that this list was not seeing a single stale value from the cache.
EDIT:
Based on Dan Bryant’s suggestion: Using long instead of double breaks it virtually instantly.
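A rough sketch of what the long variant might look like, bounded so that it terminates and reports a count instead of asserting. The names `TornLongDemo`, `IsTorn` and `CountTornReads` are hypothetical, and the sketch assumes the process runs as 32-bit (x86); in a 64-bit process, aligned 64-bit accesses are atomic and the count should stay at zero.

```csharp
using System;
using System.Threading;

public static class TornLongDemo
{
    private static long _value;
    private static volatile bool _stop;

    // A read is "torn" if it matches neither of the two values the writer stores.
    public static bool IsTorn(long observed)
    {
        return observed != 0L && observed != long.MaxValue;
    }

    // Reads _value `iterations` times while a writer thread mutates it,
    // and returns how many of those reads were torn.
    public static int CountTornReads(int iterations)
    {
        _stop = false;
        Thread writer = new Thread(KeepMutating);
        writer.Start();

        int torn = 0;
        for (int i = 0; i < iterations; i++)
        {
            long copy = _value; // plain, unsynchronized read, as in the original test
            if (IsTorn(copy)) torn++;
        }

        _stop = true;
        writer.Join();
        return torn;
    }

    private static void KeepMutating()
    {
        Random rand = new Random();
        while (!_stop)
        {
            _value = rand.Next(2) == 0 ? 0L : long.MaxValue;
        }
    }

    public static void Main()
    {
        int torn = CountTornReads(100000000);
        Console.WriteLine("Pointer size: {0} bytes, torn reads: {1}", IntPtr.Size, torn);
    }
}
```

A torn value here would look like 0x7FFFFFFF00000000 or 0x00000000FFFFFFFF: one 32-bit half from each of the two writes.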
You might try running it through CHESS to see if it can force an interleaving that breaks the test.
If you take a look at the x86 disassembly (visible from the debugger), you might also see whether the jitter is generating instructions that preserve atomicity.
EDIT: I went ahead and ran the disassembly (forcing target x86). The relevant lines are:
It uses a single fstp qword ptr to perform the write operation in both cases. My guess is that the Intel CPU guarantees atomicity of this operation, though I haven’t found any documentation to support this. Any x86 gurus who can confirm this?
UPDATE:
This fails as expected if you use Int64, which is written using the 32-bit general-purpose registers on the x86 CPU, and so takes two separate stores, rather than a single store through the special FPU registers. You can see this below:
UPDATE:
I was curious whether this would fail if I forced non-8-byte alignment of the double field in memory, so I put together this code:
It does not fail and the generated x86 instructions are essentially the same as before:
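I experimented with swapping _d1 and _d2 as the field used for the dCopy read and the write, and also tried a FieldOffset of 2. All variations generated the same basic instructions (with different offsets from those above), and none failed after several seconds (likely billions of attempts). Given these results, I’m cautiously confident that at least the Intel x86 CPUs provide atomicity of double load/store operations at the alignments I tested. One caveat: Intel's manuals only guarantee atomicity for 8-byte accesses that are 8-byte aligned or, on P6-family and later processors, that fit within a single cache line of cached memory, so a double that straddles a cache-line boundary could still tear.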
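For readers who want to try the misalignment experiment themselves, here is a minimal sketch of how such a layout might be declared. It is not the listing used for the results above: the type names, the offsets 0 and 12, and the bounded loop are all assumptions made for illustration. Only the field names `_d1` and `_d2` and the use of `FieldOffset` come from the description.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical layout: offset 12 is an assumption, chosen because it is not a multiple of 8.
[StructLayout(LayoutKind.Explicit)]
public sealed class MisalignedDoubles
{
    [FieldOffset(0)]  public double _d1;
    [FieldOffset(12)] public double _d2;
}

public static class MisalignedDoubleDemo
{
    private static readonly MisalignedDoubles Holder = new MisalignedDoubles();
    private static volatile bool _stop;

    // Reads Holder._d2 `iterations` times while a writer thread mutates it,
    // and returns how many reads saw a value other than 0 or double.MaxValue.
    public static int CountTornReads(int iterations)
    {
        _stop = false;
        Thread writer = new Thread(KeepMutating);
        writer.Start();

        int torn = 0;
        for (int i = 0; i < iterations; i++)
        {
            double copy = Holder._d2; // plain, unsynchronized read
            if (copy != 0D && copy != double.MaxValue) torn++;
        }

        _stop = true;
        writer.Join();
        return torn;
    }

    private static void KeepMutating()
    {
        Random rand = new Random();
        while (!_stop)
        {
            Holder._d2 = rand.Next(2) == 0 ? 0D : double.MaxValue;
        }
    }

    public static void Main()
    {
        Console.WriteLine("Torn reads: {0}", CountTornReads(100000000));
    }
}
```

The field offset is relative to the start of the object's field data, so the absolute address (and whether the field crosses a cache line) depends on where the runtime places the object. This sketch does not control that.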