I am trying to use SSE instructions in a Windows Forms application in VS 2010. I am using the sum_array function given in the following question:
SSE instructions to add all elements of an array
But when I compile the application, I get the following error:
error C3645: 'plot_rectangle::Form1::sum_array' : __clrcall cannot be used on functions compiled to native code
Because I am also using OpenCV functions in my application, I have to compile with the /clr option.
So how can this error be fixed when using SSE together with OpenCV?
I have also tried wrapping the SSE code in #pragma managed, like this:
#pragma managed(push, off)
uint32_t sum_array(const uint8_t a[], int n)
{
    const __m128i vk0 = _mm_set1_epi8(0);   // constant vector of all 0s for use with _mm_unpacklo_epi8/_mm_unpackhi_epi8
    const __m128i vk1 = _mm_set1_epi16(1);  // constant vector of all 1s for use with _mm_madd_epi16
    __m128i vsum = _mm_set1_epi32(0);       // initialise vector of four partial 32 bit sums
    uint32_t sum;
    int i;

    for (i = 0; i < n; i += 16)
    {
        __m128i v = _mm_load_si128((const __m128i *)&a[i]);  // load vector of 8 bit values
        __m128i vl = _mm_unpacklo_epi8(v, vk0);              // unpack to two vectors of 16 bit values
        __m128i vh = _mm_unpackhi_epi8(v, vk0);
        // unpack and accumulate 16 bit values to
        // 32 bit partial sum vector
        vsum = _mm_add_epi32(vsum, _mm_madd_epi16(vl, vk1));
        vsum = _mm_add_epi32(vsum, _mm_madd_epi16(vh, vk1));
    }

    // horizontal add of four 32 bit partial sums and return result
    vsum = _mm_add_epi32(vsum, _mm_srli_si128(vsum, 8));
    vsum = _mm_add_epi32(vsum, _mm_srli_si128(vsum, 4));
    sum = _mm_cvtsi128_si32(vsum);
    return sum;
}
#pragma managed(pop)
But I still get the same error.
Can anybody please help me sort out this problem?
You cannot use inline assembly or SSE intrinsics in code that gets compiled to IL. The workaround is simple: write it in a separate helper function that you bracket with #pragma managed, like this:
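A minimal sketch of such a helper, under a few assumptions. The name sum_array_native is hypothetical, chosen only to keep it apart from the Form1 member. Unlike the code in the question, this version uses an unaligned load and a scalar loop for the leftover bytes, so it does not require a 16-byte aligned buffer or a length that is a multiple of 16. The `#ifdef _MSC_VER` guard is only there so that other compilers skip the pragma.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <stdint.h>     // uint8_t, uint32_t

// Free function at global scope, NOT a member of the managed Form1 class.
#ifdef _MSC_VER
#pragma managed(push, off)   // compile what follows as native code, even under /clr
#endif

// Hypothetical helper name; sums n unsigned bytes starting at a.
uint32_t sum_array_native(const uint8_t* a, int n)
{
    const __m128i vk0 = _mm_setzero_si128();  // zeros, used to widen 8 bit to 16 bit
    const __m128i vk1 = _mm_set1_epi16(1);    // ones, used with _mm_madd_epi16
    __m128i vsum = _mm_setzero_si128();       // four partial 32 bit sums
    int i = 0;

    // Process full blocks of 16 bytes (assumption: unaligned load, so no
    // alignment requirement on a).
    for (; i + 16 <= n; i += 16)
    {
        __m128i v  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vl = _mm_unpacklo_epi8(v, vk0);   // low 8 bytes as 16 bit values
        __m128i vh = _mm_unpackhi_epi8(v, vk0);   // high 8 bytes as 16 bit values
        vsum = _mm_add_epi32(vsum, _mm_madd_epi16(vl, vk1));
        vsum = _mm_add_epi32(vsum, _mm_madd_epi16(vh, vk1));
    }

    // Horizontal add of the four partial sums.
    vsum = _mm_add_epi32(vsum, _mm_srli_si128(vsum, 8));
    vsum = _mm_add_epi32(vsum, _mm_srli_si128(vsum, 4));
    uint32_t sum = static_cast<uint32_t>(_mm_cvtsi128_si32(vsum));

    // Scalar tail for the remaining n % 16 bytes (assumption, not in the question).
    for (; i < n; ++i)
        sum += a[i];

    return sum;
}

#ifdef _MSC_VER
#pragma managed(pop)         // restore the previous managed/unmanaged setting
#endif
```

The important part is that the helper lives outside the Form1 ref class. As far as I know, member functions of a managed class are always compiled as managed code, which is probably why putting the pragma around the member itself did not change anything.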
And call that function from your Form1::sum_array() method.