I implemented a simple matrix-vector multiplication for sparse matrices in CRS format using an implicit OpenMP directive on the multiplication loop.
The complete code is in GitHub: https://github.com/torbjoernk/openMP-Examples/blob/icc_gcc_problem/matxvec_sparse/matxvec_sparse.cpp
Note: It’s ugly 😉
To control the private and shared memory I’m using restrict pointers. Compiling it with GCC 4.6.3 on 64-bit Linux works fine (apart from two warnings about %u and unsigned int in a printf call, but that’s not the point).
However, compiling it with ICC 12.1.0 on 64-bit Linux fails with the error:
matxvec_sparse.cpp(79): error: "default_n_row" must be specified in a variable list at enclosing OpenMP parallel pragma
#pragma omp parallel \
^
with the definition of the variable and pointer in question
int default_n_row = 4;
int *n_row = &default_n_row;
and the OpenMP directive defined as
#pragma omp parallel \
    default(none) \
    shared(n_row, aval, acolind, arowpt, vval, yval) \
    private(x, y)
{
    #pragma omp for \
        schedule(static)
    for ( x = 0; x < *n_row; x++ ) {
        yval[x] = 0;
        for ( y = arowpt[x]; y < arowpt[x+1]; y++ ) {
            yval[x] += aval[y] * vval[ acolind[y] ];
        }
    }
} /* end PARALLEL */
Compiled with g++:
c++ -fopenmp -O0 -g -std=c++0x -Wall -o matxvec_sparse matxvec_sparse.cpp
Compiled with icc:
icc -openmp -O0 -g -std=c++0x -Wall -restrict -o matxvec_sparse matxvec_sparse.cpp
- Is it an error in usage of GCC/ICC?
- Is this a design issue in my code causing undefined behaviour? If so, which line(s) is/are causing it?
- Is it just inconsistency between ICC and GCC? If so, what would be a good way to achieve compiler independence and compatibility?
Huh. Looking at the code, it’s clear what icpc thinks the problem is, but I’m not sure without going through the specification which compiler is doing the right thing here, g++ or icpc.
The issue isn’t the restrict keyword; if you take all those out and lose the -restrict option to icpc, the problem remains. The issue is that you’ve got default(none) shared(n_row...) in that parallel section, but n_row is, at the start of the program, a pointer to default_n_row. And icpc is requiring that default_n_row also be shared (or at least be given some data-sharing attribute) in that omp parallel section.
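One way to sidestep the disagreement, whichever compiler is right, is to keep the pointer dereference out of the parallel region altogether. The sketch below is only an illustration and has not been tested against ICC 12.1.0: the function name spmv_crs and the local variable n are hypothetical, and the array types (double values, int indices) are assumptions, since the post does not show the declarations. The row count is read once before the region and passed in as firstprivate, so neither n_row nor default_n_row needs a data-sharing attribute.

```cpp
#include <cassert>

// Computes y = A * v for a matrix A stored in CRS (compressed row storage).
// spmv_crs is a hypothetical helper, not a function from the original code;
// the element types (double values, int indices) are assumed.
void spmv_crs(const int *n_row, const double *aval, const int *acolind,
              const int *arowpt, const double *vval, double *yval)
{
    assert(n_row != nullptr);

    // Read the row count once, outside the parallel region, so the region
    // never dereferences n_row and never touches default_n_row.
    // n is deliberately non-const: the handling of const variables under
    // default(none) has differed between compiler versions.
    int n = *n_row;

    #pragma omp parallel default(none) \
        shared(aval, acolind, arowpt, vval, yval) firstprivate(n)
    {
        // Loop variables declared inside the region are private automatically.
        #pragma omp for schedule(static)
        for (int x = 0; x < n; x++) {
            double sum = 0.0;  // accumulate locally, write yval[x] once
            for (int y = arowpt[x]; y < arowpt[x + 1]; y++) {
                sum += aval[y] * vval[acolind[y]];
            }
            yval[x] = sum;
        }
    }
}
```

Built without an OpenMP flag, the pragmas are ignored and the loop runs serially, which makes it easy to compare results between the two compilers. If the region must see the row count through the pointer, the alternative the error message points at is to list default_n_row in the shared clause as well, provided it is visible at that point in the code.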