I wrote this little program in C++ in order to check CPU load scenarios.
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <windows.h>
#include <time.h>

int main()
{
    double x = 1;
    int t1 = GetTickCount();
    srand(10000);
    for (unsigned long i = 0; i < 10000000; i++)
    {
        int r = rand();
        double l = sqrt((double)r);
        x *= log(l/3) * pow(x, r);
    }
    int t2 = GetTickCount();
    printf("Time: %d\r\n", t2-t1);
    getchar();
}
I compiled it for both x86 and x64 on Windows 7 x64.
For some reason, the x64 version finished running in about 3 seconds,
but the x86 version took 48 (!!!) seconds.
I tried it many times and always got similar results.
What could cause this difference?
Looking at the assembler output with /Ox (maximum optimizations), the reason for the speed difference between the x86 and x64 builds is obvious: the x86 build uses x87 instructions for this computation, whereas the x64 build uses SSE instructions instead.
You can pass /arch:SSE2 to try and massage Visual Studio 2010 into producing similar instructions, but it appears the 64-bit compiler simply produces faster assembly for your task at hand.
Finally, if you relax the floating-point model (/fp:fast), the x86 and x64 builds perform nearly identically.
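If you want to confirm which floating-point code generation a given build actually targets, the compiler's predefined macros can tell you. The following is a minimal sketch, not part of the original program: `fp_codegen` is a hypothetical helper name, `_M_X64` and `_M_IX86_FP` are the MSVC macros, and the `__x86_64__`, `__SSE2_MATH__` and `__i386__` branches are included only so the same code also builds with GCC or Clang.

```cpp
#include <string>

// Hypothetical helper: describes the floating-point code generation the
// current build targets, based on predefined compiler macros.
std::string fp_codegen()
{
#if defined(_M_X64) || defined(__x86_64__)
    // 64-bit builds always have SSE2 available for scalar double math.
    return "x64 (SSE2 baseline)";
#elif defined(_M_IX86_FP)
    // MSVC, 32-bit x86: 0 = no SSE (x87), 1 = /arch:SSE, 2 = /arch:SSE2 or higher.
    return _M_IX86_FP >= 2 ? "x86 with SSE2"
         : _M_IX86_FP == 1 ? "x86 with SSE"
                           : "x86 with x87";
#elif defined(__SSE2_MATH__)
    // GCC/Clang, 32-bit x86 with SSE2 used for floating-point math.
    return "x86 with SSE2";
#elif defined(__i386__)
    return "x86 with x87";
#else
    return "non-x86 target";
#endif
}
```

Printing `fp_codegen().c_str()` at the start of `main` shows whether a flag such as /arch:SSE2 took effect for that build. It does not report the /fp model in use.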
Timings, unscientific best of 3:
x86 /Ox: 22704 ticks
x64 /Ox: 822 ticks
x86 /Ox /arch:SSE2: 3432 ticks
x64 /Ox /favor:INTEL64: 1014 ticks
x86 /Ox /arch:SSE2 /fp:fast: 834 ticks
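To repeat timings like these under different compiler flags, it can help to isolate the loop in a function and time it with a portable clock. This is a rough sketch under a few assumptions: `time_loop` is a hypothetical name, and it uses `std::chrono::steady_clock` instead of `GetTickCount`, which requires a C++11 standard library (so Visual Studio 2012 or later rather than the 2010 compiler used above).

```cpp
#include <chrono>
#include <cmath>
#include <cstdlib>

// Hypothetical helper: runs the loop from the question for `iterations`
// steps and returns the elapsed wall-clock time in milliseconds.
// The final value is stored in `result` so the optimizer cannot
// discard the computation as dead code.
long long time_loop(unsigned long iterations, double& result)
{
    double x = 1;
    const auto start = std::chrono::steady_clock::now();
    std::srand(10000);  // fixed seed, as in the original program
    for (unsigned long i = 0; i < iterations; i++)
    {
        int r = std::rand();
        double l = std::sqrt(static_cast<double>(r));
        x *= std::log(l / 3) * std::pow(x, r);
    }
    const auto stop = std::chrono::steady_clock::now();
    result = x;
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Calling `time_loop(10000000, out)` from `main` and printing the return value gives a figure comparable to the tick counts above. Build the same source once per flag combination and take the best of several runs.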