I have a simple function that takes two variables by reference:
    void foo(int*& it2,
             bit_reader<big_endian_tag>& reader2)
    {
        for(/* ... */)
        {
            *it2++ = boo(reader2.next());
            // it2++ => 0x14001d890 add qword ptr [r12], 0x4
        }
    }
The problem here is that the optimizer keeps it2 and reader2 in memory during the loop: every update goes through the reference (note the add on qword ptr [r12] above) instead of staying in a register.
However, the following code keeps the variables in registers during the loop, at the cost of extra copies before and after the loop:
    void foo2(int*& it2,
              bit_reader<big_endian_tag>& reader2)
    {
        auto reader = reader2;
        auto it = it2;
        for(/* ... */)
        {
            *it++ = boo(reader.next());
            // it++ => 0x14001d890 add r15, 0x4
        }
        reader2 = reader;
        it2 = it;
    }
How can I make the first example generate the same code as the second example but without the extra copies?
The problem is that the compiler cannot prove that it2 does not change within the function. (Well, it could, but that's vastly beyond the intended capabilities of a normal C++ compiler.)
How does it know that boo(reader2.next()) doesn't change the value? Consider a call in which boo, as a side effect, modifies the very pointer that it2 refers to, where otherInt is the int that pointer designates when foo is entered. The by-reference version does not assign anything to otherInt, whereas after your transformation it would.
So unless the compiler can prove the behavior is the same, it cannot make the optimization.
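As an illustration, here is a minimal sketch of a case where the two versions diverge. Everything except the names otherInt, boo and it2 is an assumption made for this example: buffer, cursor, reset and the two store_* functions are hypothetical, boo is given a made-up body that redirects the pointer, and the loop is reduced to a single iteration. The call to boo is written as its own statement so that the result does not depend on the evaluation order of the assignment.

```cpp
int otherInt = 0;
int buffer[2] = {0, 0};
int* cursor = &otherInt;   // the pointer that gets passed by reference

// Hypothetical boo: as a side effect it redirects the global pointer.
int boo(int value)
{
    cursor = buffer;
    return value + 1;
}

// Same shape as foo: the store goes through the reference, so it sees
// the redirected pointer and writes to buffer[0].
void store_via_reference(int*& it2)
{
    const int value = boo(41);
    *it2++ = value;
}

// Same shape as foo2: the local copy still holds the old pointer value,
// so the store lands in otherInt instead.
void store_via_local_copy(int*& it2)
{
    int* it = it2;
    const int value = boo(41);
    *it++ = value;
    it2 = it;
}

// Restore the initial state between experiments.
void reset()
{
    otherInt = 0;
    buffer[0] = 0;
    buffer[1] = 0;
    cursor = &otherInt;
}
```

Calling store_via_reference(cursor) after reset() leaves otherInt untouched, while store_via_local_copy(cursor) writes to it. Since the compiler generally cannot rule out this kind of call, it has to reload and store it2 through the reference on every iteration.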
C99 solves this problem with the restrict keyword, but C++ has no equivalent. There are extensions that exist in most C++ compilers though, such as __restrict__ or __restrict.
To do it in standard C++, you just have to be explicit and make the copy yourself.
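For the extension route, a sketch might look like the following. It assumes a compiler that accepts __restrict on references (GCC, Clang and recent MSVC do); simple_reader and demo_boo are hypothetical stand-ins for bit_reader<big_endian_tag> and boo, and the count parameter replaces the elided loop condition. Whether the optimizer then actually keeps the values in registers depends on the compiler and flags, so the generated assembly still needs checking.

```cpp
#include <cstddef>

// Hypothetical stand-in for the question's bit_reader.
struct simple_reader
{
    int state = 0;
    int next() { return state++; }
};

// Hypothetical stand-in for boo; it touches no outside state.
inline int demo_boo(int value)
{
    return value * 2;
}

// __restrict is a non-standard extension. It promises the compiler that,
// inside this function, the objects behind it2 and reader2 are accessed
// only through these references.
void foo_restrict(int*& __restrict it2,
                  simple_reader& __restrict reader2,
                  std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        *it2++ = demo_boo(reader2.next());
    }
}
```

The promise is the caller's responsibility: if something does modify the pointer or the reader through another path during the call, the behavior is undefined, which is exactly the freedom the optimizer needs.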