I have the following code which works on the compilers I have available (xlC and gcc) but I don’t know if it is fully compliant (I didn’t find anything in the OpenMP 3.0 spec that explicitly disallows it):
#include <iostream>
#include <vector>
#include <omp.h>

struct A {
    int tid;
    A() : tid(-1) { }
    A(const A&) { tid = omp_get_thread_num(); }
};

int main() {
    A a;
    std::vector<int> v(10);
    std::vector<int>::iterator it;
    #pragma omp parallel for firstprivate(a)
    for (it=v.begin(); it<v.end(); ++it)
        *it += a.tid;
    for (it=v.begin(); it<v.end(); ++it)
        std::cout << *it << ' ';
    std::cout << std::endl;
    return 0;
}
My motivation is to find out, inside the omp parallel for section, how many threads there are and what each thread’s id is (I do not wish to call omp_get_thread_num() for each element that is being processed, though). Is there any chance that I’m causing undefined behavior?
I would just decouple the start of the parallel region from the loop, and use a private variable to hold tid:
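A minimal sketch of that approach might look like the following. It is an illustration under stated assumptions, not a definitive implementation: the helper name add_thread_ids and the index-based loop are hypothetical, and the _OPENMP guard is only there so the code still builds when OpenMP is disabled.

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper (not from the question): returns a vector in which
// each element holds the id of the thread that processed it.
std::vector<int> add_thread_ids(int n) {
    assert(n >= 0);
    std::vector<int> v(n, 0);
    int tid = -1;

    // Start the parallel region on its own; each thread gets a private tid.
    #pragma omp parallel private(tid)
    {
#ifdef _OPENMP
        tid = omp_get_thread_num();  // one call per thread, not per element
#else
        tid = 0;                     // serial build: a single "thread" 0
#endif
        // Work-share the loop among the threads of the enclosing region.
        #pragma omp for
        for (int i = 0; i < n; ++i)
            v[i] += tid;
    }
    return v;
}
```

Because the thread id is read inside the parallel region by the thread itself, it does not depend on which thread runs a copy constructor. The thread count could be obtained at the same spot with omp_get_num_threads().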
Added: below are the quotes from the OpenMP specification (Section 2.9.3.4) that make me think your code is conformant and therefore does not produce UB (but see the second addition below):
Added-2: However, it is not specified which thread executes the copy constructor for a firstprivate variable. So in theory, it can be done by the master thread of the region for all copies of the variable. In that case, the value of omp_get_thread_num() will be the same in all copies: either 0 or, in case of nested parallel regions, the thread number in the outer region. So although the behavior is defined from the OpenMP standpoint, it may still result in a data race in your program.
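One way to see what a particular implementation does is to compare the id stored by the copy constructor with the id of the thread actually running each iteration. The sketch below is only a diagnostic under assumptions: current_thread and count_tid_mismatches are hypothetical names, and a result of zero on one compiler says nothing about what another conforming implementation may do.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical wrapper so the code also builds without OpenMP.
static int current_thread() {
#ifdef _OPENMP
    return omp_get_thread_num();
#else
    return 0;
#endif
}

struct A {
    int tid;
    A() : tid(-1) { }
    A(const A&) : tid(current_thread()) { }  // records the copying thread
};

// Counts the iterations in which the id recorded by the copy constructor
// differs from the id of the thread executing the iteration.
// Note: in a build without OpenMP the pragma is ignored, no copy is made,
// a.tid stays -1, and every iteration counts as a mismatch.
int count_tid_mismatches(int n) {
    assert(n >= 0);
    A a;
    int mismatches = 0;
    #pragma omp parallel for firstprivate(a) reduction(+:mismatches)
    for (int i = 0; i < n; ++i)
        if (a.tid != current_thread())
            ++mismatches;
    return mismatches;
}
```

A nonzero count in an OpenMP build would mean that at least one private copy was constructed by a thread other than the one using it.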