The below test case runs out of memory on 32 bit machines (throwing std::bad_alloc) in the loop following the “post MT section” message when OpenMP is used, however, if the #pragmas for OpenMP are commented out, the code runs through to completion fine, so it appears that when the memory is allocated in parallel threads, it does not free correctly and thus we run out of memory.
Question is whether there is something wrong with the memory allocation and deletion code below or is this a bug in gcc v4.2.2 or OpenMP? I also tried gcc v4.3 and got same failure.
int main(int argc, char** argv)
{
std::cout << "start " << std::endl;
{
std::vector<std::vector<int*> > nts(100);
#pragma omp parallel
{
#pragma omp for
for(int begin = 0; begin < int(nts.size()); ++begin) {
for(int i = 0; i < 1000000; ++i) {
nts[begin].push_back(new int(5));
}
}
}
std::cout << " pre delete " << std::endl;
for(int begin = 0; begin < int(nts.size()); ++begin) {
for(int j = 0; j < nts[begin].size(); ++j) {
delete nts[begin][j];
}
}
}
std::cout << "post MT section" << std::endl;
{
std::vector<std::vector<int*> > nts(100);
int begin, i;
try {
for(begin = 0; begin < int(nts.size()); ++begin) {
for(i = 0; i < 2000000; ++i) {
nts[begin].push_back(new int(5));
}
}
} catch (std::bad_alloc &e) {
std::cout << e.what() << std::endl;
std::cout << "begin: " << begin << " i: " << i << std::endl;
throw;
}
std::cout << "pre delete 1" << std::endl;
for(int begin = 0; begin < int(nts.size()); ++begin) {
for(int j = 0; j < nts[begin].size(); ++j) {
delete nts[begin][j];
}
}
}
std::cout << "end of prog" << std::endl;
char c;
std::cin >> c;
return 0;
}
I found this issue elsewhere seen without OpenMP but just using pthreads. The extra memory consumption when multi-threaded appears to be typical behavior for the standard memory allocator. By switching to the Hoard allocator the extra memory consumption goes away.