For the following code:
long buf[64];
register long rrax asm ("rax");
register long rrbx asm ("rbx");
register long rrsi asm ("rsi");
rrax = 0x34;
rrbx = 0x39;
__asm__ __volatile__ ("movq $buf,%rsi");
__asm__ __volatile__ ("movq %rax, 0(%rsi);");
__asm__ __volatile__ ("movq %rbx, 8(%rsi);");
printf( "buf[0] = %lx, buf[1] = %lx!\n", buf[0], buf[1] );
I get the following output:
buf[0] = 0, buf[1] = 346161cbc0!
while it should have been:
buf[0] = 34, buf[1] = 39!
Any ideas why it is not working properly, and how to solve it?
You clobber memory but don’t tell GCC about it, so GCC can cache values in buf across assembly calls. If you want to use inputs and outputs, tell GCC about everything.

You also generally want to let GCC handle most of the mov instructions, register selection, etc. Even if you explicitly constrain the registers (rrax is still %rax), let the information flow through GCC or you will get unexpected results.

__volatile__ is wrong.

The reason __volatile__ exists is so you can guarantee that the compiler places your code exactly where it is, which is a completely unnecessary guarantee for this code. It’s necessary for implementing advanced features such as memory barriers, but almost completely worthless if you are only modifying memory and registers.

Once the inputs and clobbers are declared, GCC already knows that it can’t move this assembly after printf because the printf call accesses buf, and buf could be clobbered by the assembly. GCC already knows that it can’t move the assembly before rrax = 0x34; because rax is an input to the assembly code. So what does __volatile__ get you? Nothing.

If your code does not work without __volatile__, then there is an error in the code which should be fixed instead of just adding __volatile__ and hoping that makes everything better. The __volatile__ keyword is not magic and should not be treated as such.

Alternative fix:
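A minimal sketch of what marking the inputs and clobbers could look like, assuming x86-64 with GCC or Clang and AT&T syntax. The helper names store_pair and store_pair_fixed_regs are hypothetical and do not come from the original code. The first variant lets the compiler choose the registers; the second pins rax, rbx, and rsi through the a, b, and S constraints, mirroring the registers used in the question.

```c
long buf[64];

/* Hypothetical helper: the compiler picks the registers ("r" constraints). */
void store_pair(long a, long b)
{
    __asm__ (
        "movq %1, 0(%0)\n\t"
        "movq %2, 8(%0)"
        :                           /* outputs: none */
        : "r"(buf), "r"(a), "r"(b)  /* inputs */
        : "memory");                /* the asm writes memory GCC cannot see */
}

/* Hypothetical helper: same stores, but with the registers fixed.
   "a" means %rax, "b" means %rbx, "S" means %rsi. */
void store_pair_fixed_regs(long a, long b)
{
    __asm__ ("movq %%rax, 0(%%rsi)" : : "a"(a), "S"(buf) : "memory");
    __asm__ ("movq %%rbx, 8(%%rsi)" : : "b"(b), "S"(buf) : "memory");
}
```

After calling store_pair(0x34, 0x39), printing buf[0] and buf[1] with %lx gives 34 and 39, with or without optimization.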
Is __volatile__ necessary for your original code? No. Just mark the inputs and clobber values correctly.

Why __volatile__ doesn’t help you here:

GCC is well within its rights to completely delete the assignment rrax = 0x34;, since the code in the question above claims that it never uses rrax.

A clearer example
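A minimal sketch of the kind of function this example is about, assuming x86-64 Linux (ELF symbol names, RIP-relative addressing) with GCC. The names global, store_5_sloppy, and store_5 are hypothetical. The first function is the sloppy form, where the asm declares nothing; the second declares its input and its memory clobber.

```c
long global;

/* Sloppy form: the asm names no operands, so as far as GCC can tell
   the assignment to rax is dead and may be removed under optimization. */
void store_5_sloppy(void)
{
    register long rax __asm__ ("rax");
    rax = 5;
    __asm__ __volatile__ ("movq %rax, global(%rip)");
}

/* Declared form: "a"(rax) makes rax an input, so it must be initialized
   before the asm runs; "memory" says the asm writes memory. */
void store_5(void)
{
    register long rax __asm__ ("rax");
    rax = 5;
    __asm__ ("movq %%rax, global(%%rip)" : : "a"(rax) : "memory");
}
```

Only store_5 is safe to rely on at every optimization level; store_5_sloppy is kept here purely to show what the compiler has to work with when nothing is declared.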
The disassembly is more or less as you expect it at -O0:

But with optimization off, you can be fairly sloppy about assembly. Let’s try -O2:

Whoops! Where did rax = 5; go? It’s dead code, since %rax is never used in the function, at least as far as GCC knows. GCC doesn’t peek inside assembly. What happens when we remove __volatile__?

Well, you might think __volatile__ is doing you a service by keeping GCC from discarding your precious assembly, but it’s just masking the fact that GCC thinks your assembly isn’t doing anything. GCC thinks your assembly takes no inputs, produces no outputs, and clobbers no memory. You had better straighten it out:

Now we get the following output:
Better. But if you tell GCC about the inputs, it will make sure that %rax is properly initialized first:

The output, with optimizations:
Correct! And we don’t even need to use __volatile__.

Why does __volatile__ exist?

The primary correct use for __volatile__ is if your assembly code does something else besides input, output, or clobbering memory. Perhaps it messes with special registers which GCC doesn’t know about, or affects IO. You see it a lot in the Linux kernel, but it’s misused very often in user space.

The __volatile__ keyword is very tempting because we C programmers often like to think we’re almost programming in assembly language already. We’re not. C compilers do a lot of data flow analysis, so you need to explain the data flow to the compiler for your assembly code. That way, the compiler can safely manipulate your chunk of assembly just like it manipulates the assembly that it generates.

If you find yourself using __volatile__ a lot, as an alternative you could write an entire function or module in an assembly file.
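As a contrast, a sketch of one case where __volatile__ does earn its keep: reading the x86 time-stamp counter, a register GCC does not model. The asm has no inputs, so without __volatile__ the compiler may treat two reads as the same value and merge them. This assumes x86-64 with GCC or Clang; read_tsc is a hypothetical helper name.

```c
#include <stdint.h>

/* Hypothetical helper: returns the current time-stamp counter value.
   rdtsc places the low 32 bits in %eax and the high 32 bits in %edx. */
uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    /* __volatile__ is needed: the result changes between calls even though
       the asm has no inputs that would tell the compiler so. */
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}
```

Here the outputs are still declared normally; __volatile__ only covers the side effect that constraints cannot express.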