I have the matrix system:
A x B = C
A is a by n and B is n by b. Both A and B are unknown but I have partial information about C (I have some values in it but not all) and n is picked to be small enough that the system is expected to be over constrained. It is not required that all rows in A or columns in B are over constrained.
I’m looking for something like least squares linear regression to find a best fit for this system (Note: I known there will not be a single unique solution but all I want is one of the best solutions)
To make a concrete example; all the a’s and b’s are unknown, all the c’s are known, and the ?’s are ignored. I want to find a least squares solution only taking into account the know c’s.
[ a11, a12 ] [ c11, c12, c13, c14, ? ]
[ a21, a22 ] [ b11, b12, b13, b14, b15] [ c21, c22, c23, c24, c25 ]
[ a31, a32 ] x [ b21, b22, b23, b24, b25] = C ~= [ c31, c32, c33, ?, c35 ]
[ a41, a42 ] [ ?, ?, c43, c44, c45 ]
[ a51, a52 ] [ c51, c52, c53, c54, c55 ]
Note that if B is trimmed to b11 and b21 only and the unknown row 4 chomped out, then this is almost a standard least squares linear regression problem.
I have no idea on how to deal with your missing values, so I’m going to ignore that problem.
There are no unique solutions. To find a best solution you need some sort of a metric to judge them by. I’m going to suppose you want to use a least squares metric, i.e. the best guess values of A and B are those that minimize sum of the numbers [C_ij-(A B)_ij]^2.
One thing you didn’t mention is how to determine the value you are going to use for n. In short, we can come up with ‘good’ solutions if 1 <= n <= b. This is because 1 <= rank(span(C)) <= b. Where rank(span(C)) = the dimension of the column space of C. Note that this is assuming a >= b. To be more correct we would write 1 <= rank(span(C)) <= min(a,b).
Now, supposing that you have chosen n such that 1 <= n <= b. You are going to minimize the residual sum of squares if you chose the columns of A such that span(A) = span(First n eigen vectors of C). If you don’t have any other good reasons, just choose the columns of A to be to first n eigen vectors of C. Once you have chosen A, you can get the values of B in the usual linear regression way. I.e. B = (A’A)^(-1)A’ C