I'm using a version of OpenMP that does not support the reduction clause for complex arguments. I need a fast dot-product function like
std::complex<double> dot_prod(std::complex<double> *v1, std::complex<double> *v2, int dim)
{
    std::complex<double> sum = 0.;
    int i;
    #pragma omp parallel shared(sum)
    #pragma omp for
    for (i = 0; i < dim; i++)
    {
        #pragma omp critical
        {
            sum += std::conj<double>(v1[i]) * v2[i];
        }
    }
    return sum;
}
Obviously this code does not speed up the computation but slows it down, because the critical section serializes every loop iteration. Is there a fast solution that does not rely on a reduction over complex arguments?
Each thread can first accumulate its own private partial sum; in a second step, the partial sums are combined into the final sum. The critical section is then needed only in that final step, so it is entered once per thread instead of once per iteration.
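A minimal sketch of that two-step idea, assuming the same dot_prod signature as in the question (with const added to the pointers). The name local_sum is chosen here for illustration only, and the nowait clause is an optional extra that skips the barrier after the loop, since the end of the parallel region synchronizes the threads anyway:

```cpp
#include <complex>

// Sketch: computes sum_i conj(v1[i]) * v2[i] without a reduction clause.
std::complex<double> dot_prod(const std::complex<double>* v1,
                              const std::complex<double>* v2,
                              int dim)
{
    std::complex<double> sum = 0.0;

    #pragma omp parallel shared(sum)
    {
        // Step 1: each thread accumulates into its own private partial sum.
        // local_sum is declared inside the parallel region, so it is private.
        std::complex<double> local_sum = 0.0;

        #pragma omp for nowait
        for (int i = 0; i < dim; ++i) {
            local_sum += std::conj(v1[i]) * v2[i];
        }

        // Step 2: combine the partial sums. The critical section is entered
        // once per thread, not once per loop iteration.
        #pragma omp critical
        {
            sum += local_sum;
        }
    }
    return sum;
}
```

If the file is compiled without OpenMP enabled, the pragmas are ignored and the function still returns the correct result serially. Because the order in which the partial sums are added depends on thread scheduling, the result can differ from the serial one in the last few bits for non-trivial floating-point data.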