I’m just getting started experimenting adding OpenMP to some SSE code. My first test

Question

Editorial Team

Asked: May 23, 2026 (2026-05-23T19:51:19+00:00)

I’m just getting started experimenting with adding OpenMP to some SSE code.

My first test program SOMETIMES crashes in _mm_set_ps, but works when I change the clause to if (0), so that the region runs on a single thread.

It looks so simple that I must be missing something obvious.
I’m compiling with gcc -fopenmp -g -march=core2 -pthreads

  #include <stdio.h>
  #include <stdlib.h>
  #include <immintrin.h>

  int main()
  {
  #pragma omp parallel if (1)
   {
  #pragma omp sections
       {
  #pragma omp section
           {
              __m128 x1 = _mm_set_ps ( 1.1f, 2.1f, 3.1f, 4.1f );
           }
  #pragma omp section
           {
              __m128 x2 = _mm_set_ps ( 1.2f, 2.2f, 3.2f, 4.2f );
           }
       } // end omp sections
   } //end omp parallel

  return 0;
  }

1 Answer

Editorial Team · Answer 1 · 2026-05-23T19:51:20+00:00

This looks like a stack-alignment bug in the OpenMP implementation rather than in your code. I had the same problem with gcc on Windows (MinGW), and the -mstackrealign command-line option solved it. It makes every function realign the stack to a 16-byte boundary in its prologue, which is what aligned SSE loads and stores of stack variables need. I didn’t notice any performance penalty. You can also try adding __attribute__ ((force_align_arg_pointer)) to a function declaration, which should do the same thing, but only for that specific function. You might have to put the SSE code in a separate function that you then call from the function containing the #pragma omp, so that the stack has a chance to be realigned.
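
As a rough sketch of the separate-function approach, assuming GCC on an x86 target: the SSE work lives in a helper marked force_align_arg_pointer, and the OpenMP sections only call it. The names lane_sum and sum_in_sections are made up for illustration, and the noinline attribute is my own addition, on the assumption that inlining the helper back into the OpenMP region would defeat the realignment.

```c
#include <xmmintrin.h>  /* SSE intrinsics: __m128, _mm_set_ps, _mm_storeu_ps */

/* Hypothetical helper. force_align_arg_pointer (GCC, x86 only) realigns the
   stack on entry, so aligned SSE spills inside this function are safe even
   if the calling thread's stack is not 16-byte aligned. noinline is an
   assumption: it keeps the helper a real call with its own prologue. */
__attribute__((force_align_arg_pointer, noinline))
float lane_sum(float a, float b, float c, float d)
{
    __m128 x = _mm_set_ps(a, b, c, d);
    float out[4];
    _mm_storeu_ps(out, x);  /* unaligned store: no alignment demand on out */
    return out[0] + out[1] + out[2] + out[3];
}

/* The function containing the OpenMP pragmas holds no SSE code of its own;
   each section only calls the realigning helper. */
float sum_in_sections(void)
{
    float s1 = 0.0f, s2 = 0.0f;
#pragma omp parallel
    {
#pragma omp sections
        {
#pragma omp section
            { s1 = lane_sum(1.0f, 2.0f, 3.0f, 4.0f); }
#pragma omp section
            { s2 = lane_sum(0.5f, 1.5f, 2.5f, 3.5f); }
        }
    }
    /* The implicit barrier at the end of the parallel region guarantees
       that both sections have finished before the results are combined. */
    return s1 + s2;
}
```

Calling sum_in_sections() from main and building with -fopenmp should exercise both sections on worker threads. Without -fopenmp the pragmas are ignored and the two calls simply run one after the other.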

I stopped having the problem when I moved to compiling for a 64-bit target (MinGW64, such as the TDM GCC build).

I am now playing with AVX instructions, which require 32-byte alignment, but GCC doesn’t support that on Windows at all. That forced me to fix up the generated assembly code with a Python script, but it works.
