I have an equation of the type c = Ax + By where c, x and y are vectors of dimensions say 50,000 X 1, and A and B are matrices with dimensions 50,000 X 50,000.
Is there any way in Matlab to find matrices A and B when c, x and y are known?
I have about 100,000 samples of c, x, and y. A and B remain the same for all.
Let X be the collection of all 100,000 xs you got (such that the i-th column of X equals the i-th vector x_i). In the same manner we can define Y and C as 2D collections of ys and cs respectively.
What you wish to solve is for A and B such that C = A*X + B*Y.
You have 2 * 50,000^2 unknowns (all entries of A and B) and numel(C) equations.
So, if the number of data vectors you have is 100,000, you have a single solution (provided the samples are linearly independent). If you have more than 100,000 samples, you may seek a least-squares solution.
Re-writing: C = [A B] * [X ; Y]
So, I suppose [A B] = C * pinv([X ; Y])
In matlab:
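A minimal sketch of this solve, assuming the stacked form C = [A B] * [X; Y] and using the pseudo-inverse. The toy sizes d and n and the ground-truth matrices A0 and B0 are hypothetical; they are there only so the snippet runs on its own.

```matlab
% Assumed toy sizes so the snippet runs on its own (the question has d = 50,000).
d = 3; n = 20;
X = randn(d, n); Y = randn(d, n);      % samples stored as columns
A0 = randn(d); B0 = randn(d);          % hypothetical ground truth, used only to build C
C = A0*X + B0*Y;                       % noise-free observations

AB = C * pinv([X; Y]);                 % d x 2d least-squares estimate of [A B]
A = AB(:, 1:d);                        % left block is A
B = AB(:, d+1:end);                    % right block is B
```

When [X; Y] has full row rank, ABt = [X; Y]' \ C' should give the same answer (as the transpose of [A B]) without forming the pseudo-inverse explicitly. At the full size, A and B each hold 2.5e9 doubles (about 20 GB), so as written this is practical only for small d.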
Correct me if I’m wrong…
EDIT:
It seems like there is quite a fuss around dimensionality here. So, I’ll try and make it as clear as possible.
Model: There are two (unknown) matrices A and B, each of size 50,000×50,000 (5e9 unknowns in total).
An observation is a triplet of vectors (x, y, c); each such vector has 50,000 elements (a total of 150,000 observed points per sample). The underlying model assumption is that an observation is generated by c = Ax + By.
The task: given n observations (that is, n triplets of vectors { (x_i, y_i, c_i) }_i=1..n), uncover A and B.
Now, each sample (x_i, y_i, c_i) induces 50,000 equations of the form c_i = Ax_i + By_i in the unknowns A and B. If the number of samples n is greater than 100,000, then there are more than 50,000 * 100,000 ( > 5e9 ) equations and the system is over-constrained.
To write the system in matrix form I proposed to stack all observations into matrices:
X of size 50,000 x n, with its i-th column equal to the observed x_i
Y of size 50,000 x n, with its i-th column equal to the observed y_i
C of size 50,000 x n, with its i-th column equal to the observed c_i
With these matrices we can write the model as:
C = A*X + B*Y
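To see that the stacked form says the same thing as the per-sample model, here is a small check on toy data. The sizes and the random matrices are assumptions for illustration only.

```matlab
d = 4; n = 6;                          % assumed toy sizes
A = randn(d); B = randn(d);            % hypothetical model matrices
X = randn(d, n); Y = randn(d, n);      % observations stored as columns
C = A*X + B*Y;                         % all n samples at once

i = 3;                                 % any sample index between 1 and n
c_i = A*X(:, i) + B*Y(:, i);           % single-sample model c = Ax + By
err = norm(C(:, i) - c_i);             % zero up to rounding
```

Column i of C depends only on column i of X and Y, which is why stacking the samples side by side loses nothing.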
I hope this clears things up a bit.
Thank you @Dan and @woodchips for your interest and enlightening comments.
EDIT (2):
I tried this approach in Octave on a small example. Instead of dimension 50,000 I work with only 2, and instead of n=100,000 observations I settled for n=100.
Checking the difference between the ground-truth model (A and B) and the recovered ABt yields values that are close enough to zero (remember, the observations were noisy and the solution is a least-squares one).
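The small experiment described above can be sketched as follows. This is a reconstruction under assumptions, not the original script: the noise level sigma and the use of randn for the data are hypothetical choices, and ABt is taken to be the transpose of [A B].

```matlab
% Toy sizes from the text: dimension 2 instead of 50,000, n = 100 samples.
d = 2; n = 100;
A = randn(d); B = randn(d);            % ground-truth model
X = randn(d, n); Y = randn(d, n);      % observed inputs, one sample per column
sigma = 0.01;                          % assumed noise level (hypothetical)
C = A*X + B*Y + sigma*randn(d, n);     % noisy observations

% Least-squares solve of [X; Y]' * [A B]' = C'
ABt = [X; Y]' \ C';                    % 2d x d estimate of [A B]'
err = norm(ABt - [A B]', 'fro');       % distance from the ground truth
disp(err)
```

The printed error changes from run to run because the data are random; it should shrink as sigma decreases or n grows.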