Consider a simple loop:
for(int i=0;i<32;i++)
a[i] = i;
The LLVM disassembler shows the following assembly:
.LBB0_1: # =>This Inner Loop Header: Depth=1
movl %eax, (%esp,%eax,4)
addl $1, %eax
adcl $0, %ecx
cmpl $32, %eax
jne .LBB0_1
# BB#2:
xorl %eax, %eax
addl $140, %esp
ret
Question 1: Can anyone explain the movl %eax, (%esp,%eax,4) instruction?
Moreover, the Visual Studio disassembler outputs the following assembly:
;for(int i=0;i<32;i++)
00F290B5 mov dword ptr [ebp-94h],0
00F290BF jmp main+60h (0F290D0h)
00F290C1 mov eax,dword ptr [ebp-94h]
00F290C7 add eax,1
00F290CA mov dword ptr [ebp-94h],eax
00F290D0 cmp dword ptr [ebp-94h],20h
00F290D7 jge main+7Eh (0F290EEh)
;a[i] = i;
00F290D9 mov eax,dword ptr [ebp-94h]
00F290DF mov ecx,dword ptr [ebp-94h]
00F290E5 mov dword ptr a[eax*4],ecx
00F290EC jmp main+51h (0F290C1h)
;return 0;
00F290EE xor eax,eax
Obviously, LLVM’s output is more optimized.
Question 2: Is there an option in Visual Studio to optimize the code like LLVM does?
Update:
Results after setting the Solution Configuration to Release and Optimization to Full Optimization (/Ox):
; int a[32] = {0};
; for(int i=0;i<32;i++)
0039128B xor eax,eax
0039128D lea ecx,[a]
00391293 movd xmm0,eax
00391297 pshufd xmm0,xmm0,0
0039129C paddd xmm0,xmm1
003912A0 add eax,4
; {
; a[i] = i;
003912A3 movdqu xmmword ptr [ecx],xmm0
003912A7 lea ecx,[ecx+10h]
003912AA cmp eax,20h
003912AD jl main+23h (0391293h)
; }
; return 0;
; };
003912AF mov ecx,dword ptr [ebp-4]
003912B2 xor ecx,ebp
003912B4 xor eax,eax
003912B6 call __security_check_cookie (03916FDh)
003912BB mov esp,ebp
003912BD pop ebp
003912BE ret
movl %eax, (%esp,%eax,4) is just an indirect memory store. It stores %eax into the memory location %esp + %eax * 4.
In this case:
%esp is the base address of the array a.
%eax is the index i.
4 is the size of int.
For your second question, the code output by Visual Studio looks like it was compiled without optimizations. There are a lot of unnecessary memory loads and stores.
For example, dword ptr [ebp-94h] appears to be the index variable i. Without optimizations, it never got promoted to a register.
Enable optimizations, and you will see that it produces much more sane code.
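To make the address arithmetic from the first answer concrete, here is a minimal C sketch that computes base + index * sizeof(int) by hand and stores through it, the same way the (%esp,%eax,4) operand does. The helper names element_address and fill_by_address are hypothetical and only for illustration; the scale is 4 only on targets where int is 4 bytes, which is why the code uses sizeof(int) rather than a literal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mirrors the (base,index,scale) operand by
   computing base + index * sizeof(int) in bytes. */
static int *element_address(int *base, size_t index)
{
    unsigned char *bytes = (unsigned char *)base;   /* byte-granular view of the array */
    return (int *)(bytes + index * sizeof(int));    /* same address as &base[index] */
}

/* Hypothetical helper: performs a[i] = i for i in [0, n), storing
   through the hand-computed address instead of using a[i]. */
static void fill_by_address(int *a, size_t n)
{
    assert(a != NULL);
    for (size_t i = 0; i < n; i++) {
        *element_address(a, i) = (int)i;            /* the "movl %eax, (base,index,scale)" step */
    }
}
```

Calling fill_by_address(a, 32) on an int a[32] should leave the array in the same state as the original loop, since element_address(a, i) is the same pointer as &a[i].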