Something like this: _declspec(align(16)) float dens[4]; //Here the code comes. F32vec4 S_START, Pos, _Vector

Question

Asked: June 6, 2026

Something like this:

_declspec(align(16)) float dens[4];

// The code goes here. S_START, Pos and _Vector are all F32vec4.

*((__m128*)dens) = (S_START - Pos) *_Vector;

float steps = max(max(dens[3], dens[2]), max(dens[1], dens[0]));

How do I do this directly using SSE?

1 Answer

Editorial Team · Answer 1 · 2026-06-06T12:45:14+00:00

There is no single instruction for this. SSE is not really designed for horizontal operations, so you have to shuffle.

Here’s one approach:

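#include <xmmintrin.h>  //  SSE intrinsics

__m128 a = _mm_set_ps(10,9,7,8);    //  lanes {8,7,9,10}, lowest lane first

__m128 b = _mm_shuffle_ps(a,a,78);  //  {x0,x1,x2,x3} -> {x2,x3,x0,x1}
a = _mm_max_ps(a,b);                //  lanes 0,1 = max(x0,x2), max(x1,x3)

b = _mm_shuffle_ps(a,a,177);        //  {x0,x1,x2,x3} -> {x1,x0,x3,x2}
a = _mm_max_ss(a,b);                //  lane 0 now holds the overall maximum

float out;
_mm_store_ss(&out,a);               //  out == 10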

Note that the final store is not really meant to be a store. It is just a hack to get the value into the float datatype.

In practice the compiler usually needs no instruction for it, because scalar float values are kept in the same SSE registers. (The top 3 lanes are simply ignored.)
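For reference, here is a minimal sketch of the same shuffle-and-max idea wrapped in a reusable function. The names `hmax_ps` and `max_steps` are hypothetical and not from the question. The sketch assumes the three inputs are available as plain `__m128` values (the question uses `F32vec4`, which would need converting first), and it uses `_mm_cvtss_f32` instead of a store to read lane 0.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: horizontal maximum of the four lanes of v. */
static inline float hmax_ps(__m128 v)
{
    /* _MM_SHUFFLE(1,0,3,2) == 78: {x0,x1,x2,x3} -> {x2,x3,x0,x1} */
    __m128 t = _mm_shuffle_ps(v, v, _MM_SHUFFLE(1, 0, 3, 2));
    v = _mm_max_ps(v, t);   /* lanes 0,1 = max(x0,x2), max(x1,x3) */

    /* _MM_SHUFFLE(2,3,0,1) == 177: {x0,x1,x2,x3} -> {x1,x0,x3,x2} */
    t = _mm_shuffle_ps(v, v, _MM_SHUFFLE(2, 3, 0, 1));
    v = _mm_max_ss(v, t);   /* lane 0 = maximum of all four lanes */

    /* Read lane 0 as a scalar float; no memory store is requested. */
    return _mm_cvtss_f32(v);
}

/* Hypothetical helper mirroring the question:
   the largest lane of (s_start - pos) * vec. */
static inline float max_steps(__m128 s_start, __m128 pos, __m128 vec)
{
    return hmax_ps(_mm_mul_ps(_mm_sub_ps(s_start, pos), vec));
}
```

With this, the question's last two lines would reduce to something like `float steps = max_steps(S_START, Pos, _Vector);`, and the aligned `dens` array would no longer be needed.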
