After reading this discussion I realized that I almost totally misunderstand the matter 🙂
As the description of C++ abstract machine is not rigorous enough(comparing, for instance, with JVM specification), and if a precise answer isn’t possible I would rather want to get informal clarifications about rules that reasonable "good" (non-malicious) implementation should follow.
The key concept of part 1.9 of the Standard addressing implementation freedom is so called as-if rule:
an implementation is free to disregard any requirement of this
Standard as long as the result is as if the requirement had been
obeyed, as far as can be determined from the observable behavior of
the program.
The term "observable behavior", according to the standard (I cite n3092), means the following:
— Access to volatile objects are evaluated strictly according to the
rules of the abstract machine.— At program termination, all data written into files shall be
identical to one of the possible results that execution of the program
according to the abstract semantics would have produced.— The input and output dynamics of interactive devices shall take
place in such a fashion that prompting output is actually delivered
before a program waits for input. What constitutes an interactive
device is implementation-defined.
So, roughly speaking, the order and operands of volatile access operations and io operations should be preserved; implementation may make arbitrary changes in the program which preserve these invariants (comparing to some allowed behaviour of the abstract c++ machine)
-
Is it reasonable to expect that non-malicious implementation treates io operations wide enough (for instance, any system call from user code is treated as such operation)? (E.g. RAII mutex lock/unlock wouldn’t be thrown away by compiler in case RAII wrapper contains no volatiles)
-
How deeply the "behavioral observation" should immerse from user-defined c++ program level into library/system calls? The question is, of course, only about library calls that not intended to have io/volatile access from the user viewpoint (e.g. as new/delete operations) but may (and usually does) access volatiles or io in the library/system implementation. Should the compiler treat such calls from the user viewpoint (and consider such side effects as not observable) or from "library" viewpoint (and consider the side effects as observable) ?
-
If I need to prevent some code from elimination by compiler, is it a good practice not to ask all the questions above and simply add (possibly fake) volatile access operations (wrap the actions needed to volatile methods and call them on volatile instances of my own classes) in any case that seems suspicious?
-
Or I’m totally wrong and the compiler is disallowed to remove any c++ code except of cases explicitly mentioned by the standard (as copy elimination)
The important bit is that the compiler must be able to prove that the code has no side effects before it can remove it (or determine which side effects it has and replace it with some equivalent piece of code). In general, and because of the separate compilation model, that means that the compiler is somehow limited as to what library calls have observable behavior and can be eliminated.
As to the deepness of it, it depends on the library implementation. In gcc, the C standard library uses compiler attributes to inform the compiler of potential side effects (or absence of them). For example,
strlenis tagged with a pure attribute that allows the compiler to transform this code:into
But without the pure attribute the compiler cannot know whether the function has side effects or not (unless it is inlining it, and gets to see inside the function), and cannot perform the above optimization.
That is, in general, the compiler will not remove code unless it can prove that it has no side effects, i.e. will not affect the outcome of the program. Note that this does not only relate to
volatileand io, since any variable change might have observable behavior at a later time.As to question 3, the compiler will only remove your code if the program behaves exactly as if the code was present (copy elision being an exception), so you should not even care whether the compiler removes it or not. Regarding question 4, the as-if rule stands: If the outcome of the implicit refactor made by the compiler yields the same result, then it is free to perform the change. Consider:
The compiler can freely replace that code with:
The loop is gone, but the behavior is the same: each loop interaction does not affect the outcome of the program, and the variable has the correct value at the end of the loop, i.e. if it is later used in some observable operation, the result will be as-if the loop had been executed.
Don’t worry too much on what observable behavior and the as-if rule mean, they basically mean that the compiler must yield the output that you programmed in your code, even if it is free to get to that outcome by a different path.
EDIT
@Konrad raises a really good point regarding the initial example I had with
strlen: how can the compiler know thatstrlencalls can be elided? And the answer is that in the original example it cannot, and thus it could not elide the calls. There is nothing telling the compiler that the pointer returned from theget_string()function does not refer to memory that is being modified elsewhere. I have corrected the example to use a local array.In the modified example, the array is local, and the compiler can verify that there are no other pointers that refer to the same memory.
strlentakes a const pointer and so it promises not to modify the contained memory, and the function is pure so it promises not to modify any other state. The array is not modified inside the loop construct, and gathering all that information the compiler can determine that a single call tostrlensuffices. Without the pure specifier, the compiler cannot know whether the result ofstrlenwill differ in different invocations and has to call it.