I would appreciate an extra set of eyes on this code. It’s supposed to find a best-fit ellipse for a set of data points. The problem is that the lengths of the major and minor axes (aDist and bDist) come out larger than they should.
Inputs:
- points – a set of (x,y) coordinates for the data points; x and y are non-negative
- avgX, avgY – the average of the x and y coordinates of all data points
Outputs:
- aDist, bDist – the lengths of the major and minor axes
// Find a and b -- use principal component analysis
// http://ask.metafilter.com/36213/Best-Fit-Ellipse (2nd reply)
// http://number-none.com/product/My%20Friend,%20the%20Covariance%20Body/index.html
double mat[2][2]; // Will be the covariance matrix.
// Eigenvectors will be major & minor axes. Eigenvalues will be lengths of axes, squared.
mat[0][0] = mat[0][1] = mat[1][0] = mat[1][1] = 0;
for (CPixelList::iterator i = points->begin(); i != points->end(); i++)
{
    // Add the outer product
    //   [ x - avgX ] * [ x - avgX, y - avgY ]
    //   [ y - avgY ]
    // to mat
    double diffX = i->x - avgX;
    double diffY = i->y - avgY;
    mat[0][0] += diffX * diffX;
    mat[0][1] += diffX * diffY;
    mat[1][1] += diffY * diffY;
}
mat[1][0] = mat[0][1];
// http://www.math.harvard.edu/archive/21b_fall_04/exhibits/2dmatrices/index.html
double T = mat[0][0] + mat[1][1]; // Trace
double D = mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]; // Determinant
double L1 = T/2 + sqrt(T*T/4 - D); // Eigenvalues
double L2 = T/2 - sqrt(T*T/4 - D); //
aDist = sqrt(L1);
bDist = sqrt(L2);
I have checked the inputs in the debugger, and they look OK. I have tried this code for some simple shapes (circles, ellipses, rectangles) with no rotation, and aDist and bDist are proportional to the shape but always too large. For example, if ‘points’ is a 100×100 circle, then aDist and bDist are 582.
Update: After summing up mat, I now divide each element by points->size(), as Mike suggested. If points is the square <(0,0),(10,0),(10,10),(0,10)>, then aDist and bDist are now 5, which is too small. As more pixels are added to that square, aDist and bDist get smaller. For example, <(0,0),(5,0),(10,0),(10,5),(10,10),(5,10),(0,10),(0,5)> gives a radius of sqrt(18.75)=4.33.
You need to divide mat by the total number of points to get the correct covariance matrix.
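As a rough sketch of that normalization (not the asker's actual code): Point, AxisSpread and principalSpread are made-up names for illustration, and a plain std::vector stands in for CPixelList. The sums are divided by the number of points before the eigenvalues are taken.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the pixel type stored in CPixelList.
struct Point { double x, y; };

// Hypothetical result type: standard deviation along each principal axis.
struct AxisSpread { double a, b; };

// Builds the 2x2 covariance matrix (sums divided by N) and returns the
// square roots of its eigenvalues, largest first.
AxisSpread principalSpread(const std::vector<Point>& points)
{
    assert(!points.empty());
    const double n = static_cast<double>(points.size());

    // Mean of the coordinates (avgX, avgY in the question).
    double avgX = 0.0, avgY = 0.0;
    for (const Point& p : points) { avgX += p.x; avgY += p.y; }
    avgX /= n;
    avgY /= n;

    // Accumulate the outer products of the centered points.
    double cxx = 0.0, cxy = 0.0, cyy = 0.0;
    for (const Point& p : points) {
        const double dx = p.x - avgX;
        const double dy = p.y - avgY;
        cxx += dx * dx;
        cxy += dx * dy;
        cyy += dy * dy;
    }

    // The step missing from the question: normalize by the point count.
    cxx /= n;
    cxy /= n;
    cyy /= n;

    // Closed-form eigenvalues of a symmetric 2x2 matrix via trace and determinant.
    const double T = cxx + cyy;
    const double D = cxx * cyy - cxy * cxy;
    // Clamp to guard against tiny negative values from rounding.
    const double root = std::sqrt(std::max(0.0, T * T / 4.0 - D));

    AxisSpread result;
    result.a = std::sqrt(T / 2.0 + root);
    result.b = std::sqrt(std::max(0.0, T / 2.0 - root));
    return result;
}
```

Note that after dividing by N, the square root of each eigenvalue is the standard deviation of the points along that axis, not the semi-axis length itself. For points spread evenly around a circle outline of radius r the variance along each axis is r²/2, and for a uniformly filled disc it is r²/4. So depending on which kind of point set you have, you would probably want to scale the result by sqrt(2) or by 2 to get a radius.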