I have AVX C++ code like this, which compiles fine under Visual Studio 2010:
#include <immintrin.h>
#include <iostream>
int main() {
  float data[] = {0, 1, 2, 3, 4, 5, 6, 7};
  __m256 ymm0 = _mm256_loadu_ps(data);
  // ..
  float r0 = ymm0.m256_f32[0];
  float r4 = ymm0.m256_f32[4];
  std::cout << r0 << " " << r4 << std::endl;
}
GCC, however, gives the following errors:
foo.cpp:8:18: error: request for member ‘m256_f32’ in ‘ymm0’, which is of non-class type ‘__m256 {aka __vector(8) float}’
foo.cpp:9:18: error: request for member ‘m256_f32’ in ‘ymm0’, which is of non-class type ‘__m256 {aka __vector(8) float}’
I’ve done some research, and it seems that ymm0.m256_f32 is a Microsoft-specific member for accessing the individual floats in an AVX register, not something GCC's __m256 type provides. What can I use with GCC on Linux to do the same thing?
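GCC's vector extension lets you index a vector directly, as in ymm0[0]. Older GCC releases accept this in C but not in C++, so with those you may consider rewriting small parts of the code as C. Newer releases (4.8 and later, as far as I know) accept the subscript in C++ as well.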
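A minimal sketch of the indexing route in C++, assuming a GCC or Clang release that accepts vector subscripting in C++ (recent ones do; check yours). It also shows a fallback that stores the register to a plain array, which needs no compiler extension. The helper names and the AVX_FN macro are hypothetical, not part of any library; the macro only enables AVX per function so the file builds without -mavx.

```cpp
#include <immintrin.h>

// Assumption: GCC or Clang. The attribute enables AVX for the marked
// functions only; on other compilers the macro expands to nothing.
#if defined(__GNUC__)
#  define AVX_FN __attribute__((target("avx")))
#else
#  define AVX_FN
#endif

#if defined(__GNUC__)
// Hypothetical helper: read lane i through the GNU vector extension.
// Requires a compiler that accepts vector subscripting in C++.
AVX_FN float lane_by_subscript(const __m256& v, int i) {
    return v[i];
}
#endif

// Hypothetical helper: copy all eight lanes to memory, then index the array.
// Works with any compiler that provides the AVX intrinsics.
AVX_FN float lane_by_store(const __m256& v, int i) {
    float tmp[8];
    _mm256_storeu_ps(tmp, v);
    return tmp[i];
}

// Loads eight floats as in the question and returns lane i (0..7)
// using the store-based helper.
AVX_FN float load_and_read_lane(const float* data, int i) {
    __m256 ymm0 = _mm256_loadu_ps(data);
    return lane_by_store(ymm0, i);
}

#if defined(__GNUC__)
// Same as above, but through the subscript-based helper.
AVX_FN float load_and_subscript_lane(const float* data, int i) {
    __m256 ymm0 = _mm256_loadu_ps(data);
    return lane_by_subscript(ymm0, i);
}
#endif
```

With the data array from the question, load_and_read_lane(data, 0) and load_and_read_lane(data, 4) play the role of r0 and r4. The store goes through memory, so it is best kept out of hot loops.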
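The other option is to use shuffle, extract and conversion intrinsics explicitly: _mm256_shuffle_pd, _mm256_extractf128_pd and _mm_cvtsd_f64 for doubles, or their single-precision counterparts (_mm_shuffle_ps, _mm256_extractf128_ps, _mm_cvtss_f32) for the floats in your example.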
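For the float case in the question, the intrinsic route might look like the sketch below. It uses the single-precision intrinsics (_mm256_castps256_ps128, _mm256_extractf128_ps, _mm_shuffle_ps, _mm_cvtss_f32) rather than the _pd ones. The function names and the AVX_FN macro are hypothetical, and the macro assumes GCC or Clang.

```cpp
#include <immintrin.h>

// Assumption: GCC or Clang. The attribute enables AVX for the marked
// functions only; on other compilers the macro expands to nothing.
#if defined(__GNUC__)
#  define AVX_FN __attribute__((target("avx")))
#else
#  define AVX_FN
#endif

// Lane 0: reinterpret the low 128 bits, then take the lowest float.
AVX_FN float extract_r0(const __m256& v) {
    return _mm_cvtss_f32(_mm256_castps256_ps128(v));
}

// Lane 4: pull out the high 128 bits (lanes 4..7); lane 4 is its lowest float.
AVX_FN float extract_r4(const __m256& v) {
    __m128 hi = _mm256_extractf128_ps(v, 1);
    return _mm_cvtss_f32(hi);
}

// Lane 6: extract the high half, broadcast its element 2 to every position
// with a shuffle, then take the lowest float.
AVX_FN float extract_r6(const __m256& v) {
    __m128 hi = _mm256_extractf128_ps(v, 1);
    __m128 s  = _mm_shuffle_ps(hi, hi, _MM_SHUFFLE(2, 2, 2, 2));
    return _mm_cvtss_f32(s);
}

// Hypothetical driver: loads eight floats as in the question and
// writes lanes 0, 4 and 6 to the output references.
AVX_FN void read_lanes(const float* data, float& r0, float& r4, float& r6) {
    __m256 ymm0 = _mm256_loadu_ps(data);
    r0 = extract_r0(ymm0);
    r4 = extract_r4(ymm0);
    r6 = extract_r6(ymm0);
}
```

The lane index has to be a compile-time constant for the extract and shuffle intrinsics, which is why each lane gets its own function here. In exchange, the value stays in registers instead of going through memory.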