I implemented the following code, which uses the data points in “dat” to compute the distance matrix “dist” between each point and every other point. I then use this distance matrix to find the K closest points to each point (“smallest”), and use those to compute the sum of the distances to the K nearest neighbors of each point (“sumKnn”).
The algorithm below is parallelized with OpenMP and works correctly. I just need suggestions to make it run faster. Any suggestion is highly appreciated.
    vector<vector<double> > dist(dat.size(), vector<double>(dat.size()));
    size_t p,j;
    ptrdiff_t i;
    double* sumKnn = new double[dat.size()];
    vector<vector<int > > smallest(dat.size(), vector<int>(k));
    #pragma omp parallel for private(p,j,i) default(shared)
    for(p=0;p<dat.size();++p)
    {
        int mycont=0;   // number of neighbor slots filled so far for point p
        for (j = 0; j < dat.size(); ++j)
        {
            // Euclidean distance between points p and j over c dimensions
            double ecl = 0.0;
            for (i = 0; i < c; ++i)
            {
                ecl += (dat[p][i] - dat[j][i]) * (dat[p][i] - dat[j][i]);
            }
            ecl = sqrt(ecl);
            dist[p][j] = ecl;
            //dist[j][p] = ecl;
            if(mycont<k && j!=p)
            {
                // the first k points other than p fill the list directly
                smallest[p][mycont]=j;
                mycont++;
            }
            else if(j!=p)
            {
                // find the farthest point currently in the list
                double max=0.0;
                int index=0;
                for(int i=0;i<smallest[p].size();i++)
                {
                    if(max < dist[p][smallest[p][i]])
                    {
                        index=i;
                        max=dist[p][smallest[p][i]];
                    }
                }
                // replace it if j is closer
                if(max>dist[p][j])
                {
                    smallest[p].erase(smallest[p].begin()+index);
                    smallest[p].push_back(j);
                }
            }
        }
        // sum of the distances from p to its k nearest neighbors
        double sum=0.0;
        for(int r=0;r<k;r++)
            sum+= dist[p][smallest[p][r]];
        sumKnn[p]=sum;
    }
This is more of a comment than an answer, but the comment box is too small, …
One of the useful aspects of OpenMP is that you can parallelise a serial program in steps. So your first step should be to write serial code that solves your problem. When you’ve done that you could post again and ask for help on parallelising it.
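As a rough illustration of what such a serial starting point might look like, here is a minimal sketch. The names `euclid` and `sum_knn_serial` are hypothetical and not from the question, and it swaps the question's hand-maintained neighbour list for `std::nth_element` on one row of distances at a time, which also avoids storing the full distance matrix. It assumes every point has the same number of coordinates and that `k` is smaller than the number of points.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: Euclidean distance between two points of equal dimension.
double euclid(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        s += d * d;
    }
    return std::sqrt(s);
}

// Serial reference: for each point, the sum of the distances to its
// k nearest neighbours (the point itself is excluded).
std::vector<double> sum_knn_serial(const std::vector<std::vector<double>>& dat,
                                   std::size_t k) {
    const std::size_t n = dat.size();
    assert(k < n);  // assumption: each point has at least k other points
    std::vector<double> sums(n, 0.0);
    std::vector<double> row;  // distances from the current point to all others
    row.reserve(n - 1);
    for (std::size_t p = 0; p < n; ++p) {
        row.clear();
        for (std::size_t j = 0; j < n; ++j) {
            if (j != p) row.push_back(euclid(dat[p], dat[j]));
        }
        // Move the k smallest distances to the front; their relative order
        // is unspecified, which is fine because they are only summed.
        std::nth_element(row.begin(), row.begin() + k, row.end());
        sums[p] = std::accumulate(row.begin(), row.begin() + k, 0.0);
    }
    return sums;
}
```

A version like this is mainly useful as a reference: its results can be compared against any later parallel variant on a small data set.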
To parallelise your program, find the outermost loop statement and think how distributing the loop iterations across threads will affect the calculations. I suspect that you’ll want to create a shared vector of close points as the loops go round, then sort it at the end on one thread only. Or perhaps not.
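One possible way to distribute the outermost loop is sketched below, under some assumptions of mine rather than anything stated in the question. Each iteration over `p` reads `dat` and writes only its own result slot, so in this variant no shared list of close points is needed. The function name `sum_knn_parallel` is hypothetical, and the bounded max-heap of squared distances is a substitution for the question's erase/push_back list, so that `sqrt` is only taken for the `k` distances that are kept. The loop index is signed because older OpenMP implementations require that.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <queue>
#include <vector>

// Sketch: sum of the distances to the k nearest neighbours of every point.
// Build with OpenMP enabled (for example -fopenmp with GCC or Clang);
// without it the pragma is ignored and the loop runs serially.
std::vector<double> sum_knn_parallel(const std::vector<std::vector<double>>& dat,
                                     std::size_t k) {
    const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(dat.size());
    assert(static_cast<std::ptrdiff_t>(k) < n);  // assumption: k < number of points
    std::vector<double> sums(dat.size(), 0.0);

    #pragma omp parallel for schedule(static)
    for (std::ptrdiff_t p = 0; p < n; ++p) {
        // Declared inside the loop body, so every iteration has its own heap.
        std::priority_queue<double> heap;  // max-heap of squared distances
        for (std::ptrdiff_t j = 0; j < n; ++j) {
            if (j == p) continue;
            double d2 = 0.0;
            for (std::size_t i = 0; i < dat[p].size(); ++i) {
                const double d = dat[p][i] - dat[j][i];
                d2 += d * d;
            }
            if (heap.size() < k) {
                heap.push(d2);            // still filling the first k slots
            } else if (k > 0 && d2 < heap.top()) {
                heap.pop();               // drop the current farthest neighbour
                heap.push(d2);
            }
        }
        // Take the square root only for the k squared distances that survived.
        double sum = 0.0;
        while (!heap.empty()) {
            sum += std::sqrt(heap.top());
            heap.pop();
        }
        sums[p] = sum;  // each iteration writes a distinct element
    }
    return sums;
}
```

Whether this is actually faster than the original depends on the data size, `k`, and the number of threads, so it is worth timing both on representative input.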