I have a ringbuffer that’s written to by one producer and read by N consumers. As it’s a ringbuffer it’s ok for the index being written to by the producer to be less than the current minimum index of the consumers. The position of the producer and consumers is tracked by their own Cursor.
class Cursor
{
public:
    inline int64_t Get() const { return iValue; }
    inline void Set(int64_t aNewValue)
    {
        ::InterlockedExchange64(&iValue, aNewValue);
    }
private:
    int64_t iValue;
};
//
// Returns the ringbuffer position of the furthest-behind Consumer
//
int64_t GetMinimum(const std::vector<Cursor*>& aCursors, int64_t aMinimum = INT64_MAX)
{
    for (auto c : aCursors)
    {
        int64_t next = c->Get();
        if (next < aMinimum)
        {
            aMinimum = next;
        }
    }
    return aMinimum;
}
Looking at the generated assembly code I see:
mov rax, 9223372036854775807 // rax = INT64_MAX
cmp rdx, rcx // Is the vector empty?
je SHORT $LN36@GetMinimum
npad 10
$LL21@GetMinimum:
mov r8, QWORD PTR [rdx] // r8 = c
cmp QWORD PTR [r8+56], rax // compare result of c->Get() and aMinimum
cmovl rax, QWORD PTR [r8+56] // if it's less then aMinimum = result of c->Get()
add rdx, 8 // next vector element
cmp rdx, rcx // end of the vector?
jne SHORT $LL21@GetMinimum
$LN36@GetMinimum:
fatret 0 // beautiful friend, the end
I cannot see how the compiler thinks it’s ok to read the value of c->Get(), compare it to aMinimum and then conditionally move the RE-READ value of c->Get() into aMinimum. In my mind it’s possible that this value has been changed between the cmp and cmovl instructions. If I’m correct then the following scenario is possible:
- aMinimum is currently set to 2
- c->Get() returns 1
- the cmp is done and the less-than flag is set
- another thread updates the value currently held by the current c to 3
- cmovl sets aMinimum to 3
- the Producer sees 3 and overwrites the data in position 2 of the ringbuffer even though it has not been processed yet.
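To make the interleaving concrete, here is a single-threaded replay of those steps. It is only a sketch: ReplayDoubleRead and slot are made-up names, and the write from the “other thread” is simulated by a plain assignment placed between the two reads, so no real concurrency is involved.

```cpp
#include <cstdint>

// Single-threaded replay of the scenario. `slot` stands in for the consumer's
// iValue (hypothetical name); the other thread's write is simulated by an
// ordinary assignment between the two reads.
int64_t ReplayDoubleRead()
{
    int64_t aMinimum = 2;
    int64_t slot = 1;                   // c->Get() would return 1

    const bool less = slot < aMinimum;  // cmp QWORD PTR [r8+56], rax    (first read)
    slot = 3;                           // "another thread" moves the cursor to 3
    if (less)
    {
        aMinimum = slot;                // cmovl rax, QWORD PTR [r8+56]  (second read)
    }
    return aMinimum;                    // 3, although the value that was compared was 1
}
```

Calling ReplayDoubleRead() returns 3, which is the value the Producer would then act on.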
Have I been looking at it for too long? Shouldn’t it be something like:
mov rbx, QWORD PTR [r8+56]
cmp rbx, rax
cmovl rax, rbx
You aren’t using atomics or any kind of interthread sequencing operations around your access to iValue (presumably the same would be true of whatever might be modifying iValue on another thread, but we’ll see that that doesn’t matter), so the compiler is free to assume that it will remain unchanged between the two assembly lines of code. If another thread modifies iValue you have undefined behavior.
If your code is intended to be threadsafe, then you’ll need to use atomics, locks or some sequencing operation.
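As a rough illustration of the “use atomics” option, here is one way the Cursor could be written with std::atomic. This is a sketch under assumptions, not a drop-in fix: the constructor and the acquire/release memory orders are my own choices (the default sequentially consistent ordering would also work), and it says nothing about how the rest of the ringbuffer synchronizes.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Sketch: the cursor value lives in a std::atomic, so concurrent Get()/Set()
// calls from different threads are no longer a data race.
class Cursor
{
public:
    // The constructor is an assumption; the original class does not show one.
    explicit Cursor(int64_t aInitial = 0) : iValue(aInitial) {}

    int64_t Get() const { return iValue.load(std::memory_order_acquire); }
    void Set(int64_t aNewValue) { iValue.store(aNewValue, std::memory_order_release); }

private:
    std::atomic<int64_t> iValue;
};

// Returns the ringbuffer position of the furthest-behind Consumer.
// Each cursor is loaded once into `next`, and that same value is both
// compared and assigned.
int64_t GetMinimum(const std::vector<Cursor*>& aCursors, int64_t aMinimum = INT64_MAX)
{
    for (const Cursor* c : aCursors)
    {
        const int64_t next = c->Get();
        if (next < aMinimum)
        {
            aMinimum = next;
        }
    }
    return aMinimum;
}
```

With an atomic load, the value that is compared has to be the value that ends up in aMinimum, so the compare-then-re-read pattern from the question is no longer a valid transformation. A consumer can still advance right after it has been read, but that only makes the returned minimum conservative.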
The C++11 standard formalizes this in section 1.10 “Multi-threaded executions and data races”, which is not particularly light reading. I think the parts relevant to this example are:
Paragraph 10:
Say that evaluation A corresponds to the Cursor::Get() function and evaluation B corresponds to some unseen code that modifies iValue. Evaluation A (Cursor::Get()) performs no operation on an atomic object and isn’t dependency ordered before anything else (so there’s no “X” involved here).
And if we say that evaluation A corresponds to the code that modifies iValue and B corresponds to Cursor::Get(), the same conclusion can be drawn. So there is no “dependency-ordered before” relation between Cursor::Get() and the modifier of iValue.
Therefore, Cursor::Get() isn’t dependency ordered before whatever might modify iValue.
Paragraph 11:
Again, none of those conditions is met, so there’s no inter-thread happens before.
Paragraph 12:
We’ve shown that neither operation “inter-thread happens before” the other. And the term “sequenced before” is defined in 1.9/13 “Program execution” as applying only to evaluations that occur on a single thread (“sequenced before” is C++11’s replacement for the old “sequence point” terminology). Since we’re talking about operations on separate threads, A cannot be sequenced before B.
So at this point, we find that Cursor::Get() does not “happen before” an iValue modification that occurs on another thread (and vice-versa). Finally we get to the bottom line for this in paragraph 21:
So, if you want to use Cursor::Get() on one thread and something modifying iValue on another thread, you need to use atomics or some other sequencing operation (mutex or such) to avoid undefined behavior.
Note that according to the standard, volatile isn’t enough to provide sequencing between threads. Microsoft’s compiler may provide some additional promises to volatile to support well-defined interthread behavior, but that support is configurable so my suggestion would be to avoid relying on volatile for new code. Here’s a bit of what MSDN has to say about this (http://msdn.microsoft.com/en-us/library/vstudio/12a04hfd.aspx):