I have this very simple code:
#include <stdio.h>
#include <math.h>

int main()
{
    long v = 35;
    double app = (double)v;

    app /= 100;
    app = log10(app);
    printf("Calculated log10 %lf\n", app);
    return 0;
}
This code works perfectly on x86, but not on ARM, where the printed result is 0.00000. Any ideas?
Other info:
Operating system: Linux 3.2.27
I built the ARM toolchain with ct-ng: arm-unknown-linux-gnueabi-
libc version 2.13
Output of gcc -v:
Using built-in specs.
COLLECT_GCC=arm-unknown-linux-gnueabi-gcc
COLLECT_LTO_WRAPPER=/opt/x-tools/arm-unknown-linux-gnueabi/libexec/gcc/arm-unknown-linux-gnueabi/4.5.1/lto-wrapper
Target: arm-unknown-linux-gnueabi
Configured with: /home/mirko/misc/rasppi-ct-ng-files/.build/src/gcc-4.5.1/configure --build=x86_64-build_unknown-linux-gnu --host=x86_64-build_unknown-linux-gnu --target=arm-unknown-linux-gnueabi --prefix=/opt/x-tools/arm-unknown-linux-gnueabi --with-sysroot=/opt/x-tools/arm-unknown-linux-gnueabi/arm-unknown-linux-gnueabi//sys-root --enable-languages=c --disable-multilib --with-pkgversion=crosstool-NG-1.9.3 --enable-__cxa_atexit --disable-libmudflap --disable-libgomp --disable-libssp --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-gmp=/home/mirko/misc/rasppi-ct-ng-files/.build/arm-unknown-linux-gnueabi/build/static --with-mpfr=/home/mirko/misc/rasppi-ct-ng-files/.build/arm-unknown-linux-gnueabi/build/static --with-mpc=/home/mirko/misc/rasppi-ct-ng-files/.build/arm-unknown-linux-gnueabi/build/static --with-ppl=/home/mirko/misc/rasppi-ct-ng-files/.build/arm-unknown-linux-gnueabi/build/static --with-cloog=/home/mirko/misc/rasppi-ct-ng-files/.build/arm-unknown-linux-gnueabi/build/static --with-libelf=/home/mirko/misc/rasppi-ct-ng-files/.build/arm-unknown-linux-gnueabi/build/static --enable-threads=posix --enable-target-optspace --with-local-prefix=/opt/x-tools/arm-unknown-linux-gnueabi/arm-unknown-linux-gnueabi//sys-root --disable-nls --enable-symvers=gnu --enable-c99 --enable-long-long
Thread model: posix
gcc version 4.5.1 (crosstool-NG-1.9.3)
Floating-point support on ARM Linux distributions is not trivial. Because of that, you should use a toolchain that matches your system (both the operating system and the hardware) and pass the right compiler switches.
The first thing you need to understand is ARM’s calling convention, which answers the question “how are arguments passed when you call a function?”. ARM, being a RISC architecture, can only operate on registers. There are no instructions that manipulate memory directly: to change a value in memory you first load it into a register, modify it, and then store it back to memory.
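As a small illustration of that load/modify/store pattern, here is a sketch in C. The function name bump is made up for this example, and the instruction sequence in the comment is only what a compiler for ARM might typically emit, not output from a real build.

```c
/* Hypothetical example: increment a counter that lives in memory.
 * On a load/store architecture such as ARM this cannot be done by a
 * single instruction operating on memory; a compiler would typically
 * emit something along these lines (register choice may differ):
 *
 *     ldr r3, [r0]      @ load the value into a register
 *     add r3, r3, #1    @ modify it in the register
 *     str r3, [r0]      @ store it back to memory
 */
void bump(int *counter)
{
    *counter += 1;
}
```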
When you call a function you may need to pass arguments to it. You could put the arguments on the stack (memory), but since ARM can only work with registers, the first thing the function would probably do is load them back into registers. To avoid this waste, the ARM calling convention passes arguments in registers. However, since ARM has a limited number of registers, the calling convention dictates that only the first four registers (r0-r3) are used for the first four arguments; any remaining arguments are still passed on the stack.
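A minimal sketch of what that rule means for an ordinary C function. The name sum6 is hypothetical, and the register assignment in the comment simply restates the rule described above rather than quoting a real disassembly.

```c
/* Hypothetical six-argument function. Following the rule described
 * above, a..d would arrive in r0..r3, while e and f would be passed
 * on the stack. The integer result is returned in r0. */
int sum6(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}
```

Keeping hot functions at four or fewer word-sized arguments is therefore a cheap way to avoid stack traffic on calls.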
The second thing is that early ARM cores didn’t have any floating-point support; floating-point operations were implemented in software. (This is what is still supported via gcc’s -mfloat-abi=soft.)
We can easily demonstrate what this means with the following snippet.
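The snippet itself is not shown here. Judging from the description that follows (a function called pi2 that takes one single-precision argument and multiplies it by the constant 3.14), it presumably looked something like this minimal sketch; the exact source is an assumption.

```c
/* Assumed reconstruction: multiply a single-precision argument by 3.14. */
float pi2(float a)
{
    return a * 3.14f;
}
```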
Compiling this with -c -O3 -mfloat-abi=soft and running objdump on the result shows what the compiler generates.
In the disassembly (although part of it is not actually visible 🙂), pi2 gets its parameter in r0, loads the pi constant into r1, and uses __aeabi_fmul to multiply the two and return the result in r0. Since __aeabi_fmul uses the same calling convention, nothing is done to r0 explicitly. All our function does is populate r1 and delegate to __aeabi_fmul.
When hardware floating-point support was added to ARM (again because of the architecture style), it came with its own set of registers (s0, s1, …).
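To make “a float in r0” concrete: when a single-precision value is passed in a core register, it travels as its raw 32-bit pattern. The sketch below shows that view in portable C. The names float_to_bits and bits_to_float are invented for this example, and it assumes float is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret a float as the 32-bit pattern that
 * would sit in a core register such as r0, and back again.
 * Assumes sizeof(float) == 4 (IEEE 754 single precision). */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* copy the bits, no conversion */
    return bits;
}

float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);      /* copy the bits, no conversion */
    return f;
}
```

For example, float_to_bits(1.0f) is 0x3f800000. Moving a value between a core register and a floating-point register copies exactly these bits; it does not convert the number.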
If we compile the same snippet with -c -O3 -mfloat-abi=softfp and dump it, the compiler no longer generates a call to __aeabi_fmul. Instead it emits a vmul.f32 instruction after moving the argument from r0 to s14 and loading 3.14 into s15. After the multiplication it moves the result from s14 back to r0, since any caller of this function expects it there because of the calling convention.
Now, if you think of pi2 as a library provided to you by some third party, you can see that the soft and softfp implementations do the same thing for you and can be used interchangeably. If the system provides them, you don’t need to care whether your app runs on a system with hardware floating-point support or not. This was quite good for keeping old software running on new hardware.
However, while it keeps compatibility, this approach introduces the overhead of moving values between ARM registers and FP registers. This obviously affects performance, and it is addressed by a new calling convention, called hard by gcc. This convention states that if your function has floating-point arguments, they can be passed in floating-point registers, interleaved with the normal ones, and that floating-point values can be returned in floating-point register s0.
Again, if we compile our snippet with -c -O3 -mfloat-abi=hard and dump it, no registers get moved around. The argument to pi2 is passed in s0, the compiler generates code to load 3.14 into s15, and it uses vmul.f32 s0, s0, s15 to leave the result we want in s0.
The big problem with this new convention is that, while it improves the code produced by the compiler, it completely kills compatibility. You can’t expect an application built with the hard convention to work with libraries built for soft/softfp, and an application built for softfp won’t work with libraries built for hard.
For more information on calling conventions you should check ARM’s website.
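Since mixing conventions is what breaks things, it can help to check which one a given compiler invocation actually targets. The following is a minimal sketch using predefined macros. The function name float_abi_name is made up here. __ARM_PCS_VFP and __SOFTFP__ are macros that GCC is understood to predefine for the hard and soft cases respectively, but treat that as an assumption and verify it against the predefined macros of your own toolchain.

```c
/* Hypothetical helper: report the float ABI the compiler is targeting,
 * based on predefined macros (assumed GCC behaviour, verify locally). */
const char *float_abi_name(void)
{
#if !defined(__arm__)
    return "not ARM";
#elif defined(__ARM_PCS_VFP)
    return "hard";      /* FP arguments passed in s0, s1, ... */
#elif defined(__SOFTFP__)
    return "soft";      /* no FP instructions, library helper calls */
#else
    return "softfp";    /* FP instructions, arguments in core registers */
#endif
}
```

Printing this from a test program built with the same flags as the application, and comparing it with how the target system’s libc and libm were built, is one way to spot a mismatch like the one that can make log10 appear to return 0.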