I realize that reduction is only usable for POD types in C++. What would you do to implement a reduction for a complex type accumulator?
complex<double> x(0.0,0.0), y(1.0,1.0);
#pragma omp parallel for reduction(+:x)
for(int i=0; i<5; i++)
{
    x += y;
}
(noting that I may have left some syntax out). It seems an obvious solution would be to split real and imaginary components into temporary doubles, then accumulate on those. I guess I’m looking for elegance, and that seems … less than pretty. Would that be the typical approach here?
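For concreteness, the split into temporary doubles described above might look like the sketch below. The function name sum_split and its parameters are made up for illustration, and the loop simply adds the same value n times, as in the snippet from the question.

```cpp
#include <cassert>
#include <complex>

// Hypothetical helper: adds `y` to an accumulator `n` times by reducing
// the real and imaginary parts as two plain doubles, which the OpenMP
// reduction clause accepts.
std::complex<double> sum_split(std::complex<double> y, int n)
{
    assert(n >= 0);  // illustrative precondition, not part of the technique

    double re = 0.0, im = 0.0;
    #pragma omp parallel for reduction(+:re, im)
    for (int i = 0; i < n; i++)
    {
        re += y.real();
        im += y.imag();
    }

    // Reassemble the complex result after the parallel loop.
    return std::complex<double>(re, im);
}
```

Called as sum_split(std::complex<double>(1.0, 1.0), 5), this stands in for the x += y loop. If the compiler is run without OpenMP enabled, the pragma is ignored and the loop runs serially.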
The typical workaround, in the absence of user-defined reductions in OpenMP, is even uglier than what you suggested. Usually, before the parallel region, people create an array with at least as many elements as there will be threads in the region, accumulate partial results separately for each thread using omp_get_thread_num() as an index into the array, and do the final reduction of the accumulated results in a loop after the parallel region.
As far as I know, the OpenMP language committee is working on adding user-defined reductions to the specification, so maybe this will finally be resolved in a few years.
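A minimal sketch of the per-thread array workaround described above. The function name sum_per_thread and its parameters are hypothetical, and sizing the array with omp_get_max_threads() is one assumed way of getting "at least as many elements as threads". The serial fallbacks exist only so the sketch still builds when OpenMP is switched off.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial fallbacks so the sketch still builds without OpenMP enabled.
static int omp_get_max_threads() { return 1; }
static int omp_get_thread_num() { return 0; }
#endif

// Hypothetical helper: adds `y` to an accumulator `n` times, keeping one
// partial accumulator per thread instead of using a reduction clause.
std::complex<double> sum_per_thread(std::complex<double> y, int n)
{
    // One zero-initialized slot per thread that the region may use.
    std::vector<std::complex<double>> partial(
        static_cast<std::size_t>(omp_get_max_threads()));

    #pragma omp parallel
    {
        const std::size_t tid =
            static_cast<std::size_t>(omp_get_thread_num());
        assert(tid < partial.size());

        #pragma omp for
        for (int i = 0; i < n; i++)
        {
            partial[tid] += y;  // each thread touches only its own slot
        }
    }

    // Final serial reduction over the per-thread partial results.
    std::complex<double> x(0.0, 0.0);
    for (std::size_t t = 0; t < partial.size(); t++)
    {
        x += partial[t];
    }
    return x;
}
```

Because neighbouring slots of the array can share a cache line, a real loop would probably accumulate into a local variable inside the parallel region and write it to partial[tid] once at the end. The sketch keeps the direct indexing to stay close to the description.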