Below is a portion of code parallelized with OpenMP. The arrays ap[] and sc[] are updated with addition assignment (+=), so I made them shared and put the updates in a critical section, since the reduction clause does not accept arrays. But the result differs from that of the serial version. Where is the problem?
Vector PN, Pf, Nf; // Vector is user-defined structure
Vector NNp, PPp;
Vector gradFu, gradFv, gradFw;
float dynVis_eff, SGSf;
float Xf_U, Xf_H;
float mf_P, mf_N;
float an_diff, an_conv_P, an_conv_N, an_trans;
float sc_cd, sc_pres, sc_trans, sc_SGS, sc_conv_P, sc_conv_N;
float ap_trans;
#pragma omp parallel for
for (int e=0; e<nElm; ++e)
{
    ap[e] = 0.f;
    sc[e] = 0.f;
}
#pragma omp parallel for shared(ap,sc)
for (int f=0; f<nFaces; ++f)
{
    PN = cntE[face_N[f]] - cntE[face_P[f]];
    Pf = cntF[f] - cntE[face_P[f]];
    Nf = cntF[f] - cntE[face_N[f]];
    PPp = Pf - (Pf|norm(PN))*norm(PN);
    NNp = Nf - (Nf|norm(PN))*norm(PN);
    mf_P = mf[f];
    mf_N = -mf[f];
    SGSf = (1.f-ifac[f]) * SGSvis[face_P[f]]
         + ifac[f] * SGSvis[face_N[f]];
    dynVis_eff = dynVis + SGSf;
    an_diff = dynVis_eff * Ad[f] / mag(PN);
    an_conv_P = -neg(mf_P);
    an_conv_N = -neg(mf_N);
    an_P[f] = an_diff + an_conv_P;
    an_N[f] = an_diff + an_conv_N;
    // cross-diffusion
    sc_cd = an_diff * ( (gradVel[face_N[f]]|NNp) - (gradVel[face_P[f]]|PPp) );
    #pragma omp critical
    {
        ap[face_P[f]] += an_N[f];
        ap[face_N[f]] += an_P[f];
        sc[face_P[f]] += sc_cd + sc_conv_P;
        sc[face_N[f]] += -sc_cd + sc_conv_N;
    }
}
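You have not declared whether the other variables used in the parallel region should be shared or private. You can set this generically with the default clause. If no default is specified, every variable declared outside the parallel region is shared (only the loop index is private), which is what causes the problem in your code: all threads write to the same PN, Pf, Nf, an_diff, sc_cd and so on at the same time. In your case, I'm guessing you should make the temporaries that are assigned inside the loop (PN, Pf, Nf, PPp, NNp, mf_P, mf_N, SGSf, dynVis_eff, an_diff, an_conv_P, an_conv_N, sc_cd) private and keep the arrays shared.
I strongly recommend always using default(none), so that the compiler complains about every variable you have not declared explicitly and forces you to think about it.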
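To illustrate the advice, here is a minimal sketch of the same scatter-add pattern using default(none). It is not the code from the question: assembleDiagonal, its parameters and the single coefficient per face are hypothetical simplifications. The per-face temporary is declared inside the loop body, which makes it private to each thread, and everything used in the region is listed in shared(). The sketch also swaps the critical section for atomic updates, which is an alternative for simple += operations and not something the question requires.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for the face loop in the question:
// every face adds one coefficient to its two neighbouring cells.
std::vector<float> assembleDiagonal(int nCells,
                                    const std::vector<int>& faceP,
                                    const std::vector<int>& faceN,
                                    const std::vector<float>& coeff)
{
    std::vector<float> apOut(nCells, 0.f);

    // Plain (non-const) locals, so they can be listed in shared()
    // regardless of how the compiler treats const variables.
    int nFaces = static_cast<int>(coeff.size());
    const int* P = faceP.data();
    const int* N = faceN.data();
    const float* c = coeff.data();
    float* ap = apOut.data();

    // default(none): any variable not listed here is a compile error.
    #pragma omp parallel for default(none) shared(nFaces, P, N, c, ap)
    for (int f = 0; f < nFaces; ++f)
    {
        // Declared inside the loop body, so each thread has its own copy.
        float an = c[f];

        // Two faces may touch the same cell, so the updates must be protected.
        #pragma omp atomic
        ap[P[f]] += an;
        #pragma omp atomic
        ap[N[f]] += an;
    }
    return apOut;
}
```

If you would rather keep the temporaries declared outside the loop, as in the question, listing them in a private() clause has the same effect. Note that floating-point additions may still be applied in a different order from run to run, so tiny rounding differences from the serial result are possible even when the code is race-free.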