I’m using the GCC SIMD vector extension for a project. Everything works quite well except casts, which appear to simply reset all the components of a vector.
The manual states:
It is possible to cast from one vector type to another, provided they are of the same size (in fact, you can also cast vectors to and from other datatypes of the same size).
Here’s a simple example:
#include <stdio.h>
typedef int int4 __attribute__ (( vector_size( sizeof( int ) * 4 ) ));
typedef float float4 __attribute__ (( vector_size( sizeof( float ) * 4 ) ));
int main()
{
    int4 i = { 1 , 2 , 3 , 4 };
    float4 f = { 0.1 , 0.2 , 0.3 , 0.4 };
    printf( "%i %i %i %i\n" , i[0] , i[1] , i[2] , i[3] );
    printf( "%f %f %f %f\n" , f[0] , f[1] , f[2] , f[3] );
    f = ( float4 )i;
    printf( "%f %f %f %f\n" , f[0] , f[1] , f[2] , f[3] );
}
Compiling with gcc cast.c -O3 -o cast and running on my machine I get:
1 2 3 4
0.100000 0.200000 0.300000 0.400000
0.000000 0.000000 0.000000 0.000000 <-- no no no
I’m no assembly guru, but all I see here are some byte movements:
[...]
400454: f2 0f 10 1d 1c 02 00    movsd 0x21c(%rip),%xmm3
40045b: 00
40045c: bf 49 06 40 00          mov $0x400649,%edi
400461: f2 0f 10 15 17 02 00    movsd 0x217(%rip),%xmm2
400468: 00
400469: b8 04 00 00 00          mov $0x4,%eax
40046e: f2 0f 10 0d 12 02 00    movsd 0x212(%rip),%xmm1
400475: 00
400476: f2 0f 10 05 12 02 00    movsd 0x212(%rip),%xmm0
40047d: 00
40047e: 48 83 c4 08             add $0x8,%rsp
400482: e9 59 ff ff ff          jmpq 4003e0
I suspect this is the vector equivalent of the scalar:
*( int * )&float_value = int_value;
How can you explain this behavior?
That’s what vector casts are defined to do: they reinterpret the bits and do not convert the values (anything else would be completely bonkers, and would make standard vector programming idioms very painful to write). Your vector was not reset. The bit patterns of the integers 1 through 4, read as floats, are denormals on the order of 1e-45, which %f prints as 0.000000. If you want an actual value conversion, you’ll probably want to use an intrinsic of some sort, like _mm_cvtepi32_ps. This breaks the nice architectural independence of your vector code, of course, which is also annoying; a common approach is to use a translation header that defines a portable set of “intrinsics”.
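As a rough sketch of what one entry in such a translation header might look like: the name convert_int4_to_float4 is made up for illustration, and the per-lane fallback is my assumption of a reasonable portable path, not something the GCC manual prescribes. It uses the SSE2 intrinsic where it is available and an ordinary lane-by-lane conversion elsewhere.

```c
#include <assert.h>      /* static_assert */

#if defined( __SSE2__ )
#include <emmintrin.h>   /* _mm_cvtepi32_ps, __m128i */
#endif

typedef int   int4   __attribute__ (( vector_size( sizeof( int ) * 4 ) ));
typedef float float4 __attribute__ (( vector_size( sizeof( float ) * 4 ) ));

/* Vector casts are only allowed between types of the same size. */
static_assert( sizeof( int4 ) == sizeof( float4 ) , "int4 and float4 must match in size" );

/* Hypothetical helper: converts the VALUES lane by lane (1 -> 1.0f),
   unlike a cast, which only reinterprets the bits. */
static float4 convert_int4_to_float4( int4 v )
{
#if defined( __SSE2__ )
    /* The casts here are plain reinterpretations between 16-byte vector types. */
    return ( float4 )_mm_cvtepi32_ps( ( __m128i )v );
#else
    /* Portable fallback: ordinary scalar int -> float conversion per lane. */
    float4 out = { 0 };
    for ( int k = 0 ; k < 4 ; k++ )
        out[k] = ( float )v[k];
    return out;
#endif
}
```

Replacing the cast in the question with f = convert_int4_to_float4( i ); should make the last line print 1.000000 2.000000 3.000000 4.000000. As far as I know, Clang and GCC 9 or later also provide __builtin_convertvector, which covers the same need without a wrapper.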
Why is this useful? A variety of reasons, but here’s the biggest:
In vector code, you almost never want to branch. Instead, if you need to do something conditionally, you evaluate both sides of the condition, and use a mask to select the appropriate result lane by lane. These mask vectors “naturally” have integer type, whereas your data vectors are often floating-point; you want to combine the two using logical operations. This extremely common idiom is most natural if vector casts simply re-interpret the bits.
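The select idiom described above can be sketched with the same GCC vector extension. The helper names select_float4 and max_float4 are invented for this example. It relies on GCC's documented behavior that a vector comparison yields a signed integer vector with -1 in the lanes where the comparison holds and 0 elsewhere.

```c
#include <assert.h>   /* static_assert */

typedef int   int4   __attribute__ (( vector_size( sizeof( int ) * 4 ) ));
typedef float float4 __attribute__ (( vector_size( sizeof( float ) * 4 ) ));

static_assert( sizeof( int4 ) == sizeof( float4 ) , "int4 and float4 must match in size" );

/* Hypothetical helper: per lane, pick if_true where mask is all ones (-1)
   and if_false where mask is 0. The casts reinterpret the float bits as
   integers so that the bitwise operators can be applied. */
static float4 select_float4( int4 mask , float4 if_true , float4 if_false )
{
    int4 t = ( int4 )if_true;
    int4 f = ( int4 )if_false;
    return ( float4 )( ( mask & t ) | ( ~mask & f ) );
}

/* Branch-free lane-wise maximum: both candidates are already computed,
   and the comparison mask chooses between them. */
static float4 max_float4( float4 a , float4 b )
{
    int4 mask = a > b;   /* -1 where a > b, 0 elsewhere */
    return select_float4( mask , a , b );
}
```

If the casts converted values instead of reinterpreting bits, the AND/OR combination would mangle the floating-point data, which is why the bit-preserving definition suits this idiom.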
Granted, it’s possible to work around this case, or any of a bag of other common vector idioms, but the “vector is a bag of bits” view is extremely common, and reflects the way most vector programmers think.
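To check the bag-of-bits view against the values from the question, here is a small sketch. The function names are made up for illustration. It compares the bytes before and after the cast and casts back again to recover the original integers.

```c
#include <assert.h>   /* static_assert */
#include <string.h>   /* memcmp */

typedef int   int4   __attribute__ (( vector_size( sizeof( int ) * 4 ) ));
typedef float float4 __attribute__ (( vector_size( sizeof( float ) * 4 ) ));

static_assert( sizeof( int4 ) == sizeof( float4 ) , "int4 and float4 must match in size" );

/* Returns 1 if the cast left every byte of the vector untouched. */
static int cast_preserves_bits( int4 i )
{
    float4 f = ( float4 )i;
    return memcmp( &i , &f , sizeof( i ) ) == 0;
}

/* Casting to float4 and back yields the original integers,
   because no value conversion happens in either direction. */
static int4 round_trip( int4 i )
{
    return ( int4 )( float4 )i;
}
```

For i = { 1 , 2 , 3 , 4 }, cast_preserves_bits( i ) returns 1 and round_trip( i ) gives back the same four integers, so nothing was lost in the question's program. The floats only look like zeros when printed with %f.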