Could someone explain this statement:
shared variables
x = 0, y = 0
Core 1 Core 2
x = 1; y = 1;
r1 = y; r2 = x;
How is it possible to have r1 == 0 and r2 == 0 on x86 processors?
Sign Up to our social questions and Answers Engine to ask questions, answer people’s questions, and connect with other people.
Login to our social questions & Answers Engine to ask questions answer people’s questions & connect with other people.
Lost your password? Please enter your email address. You will receive a link and will create a new password via email.
Please briefly explain why you feel this question should be reported.
Please briefly explain why you feel this answer should be reported.
Please briefly explain why you feel this user should be reported.
The problem can arise due to optimizations involving reordering of instructions. In other words, both processors can assign
r1andr2before assigning variablesxandy, if they find that this would yield better performance. This can be solved by adding a memory barrier, which would enforce the ordering constraint.To quote the slideshow you mentioned in your post:
Regarding the x86 architecture, the best resource to read is Intel® 64 and IA-32 Architectures Software Developer’s Manual (Chapter 8.2 Memory Ordering). Sections 8.2.1 and 8.2.2 describe the memory-ordering implemented by
Intel486, Pentium, Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium 4, Intel
Xeon, and P6 family processors: a memory model called processor ordering, as opposed to program ordering (strong ordering) of the older Intel386 architecture (where read and write instructions were always issued in the order they appeared in the instruction stream).
The manual describes many ordering guarantees of the processor ordering memory model (such as Loads are not reordered with other loads, Stores are not reordered with other stores, Stores are not reordered with older loads etc.), but it also describes the allowed reordering rule which causes the race condition in the OP’s post:
On the other hand, if the original order of the instructions was switched:
In this case, processor guarantees that
r1 = 1andr2 = 1situation is not allowed (due to 8.2.3.3 Stores Are Not Reordered With Earlier Load guarantee), meaning that those instructions would never be reordered in individual cores.To compare this with different architectures, check out this article: Memory Ordering in Modern Microprocessors. You can see that Itanium (IA-64) does even more reordering than the IA-32 architecture: