I’m new to image processing, but I’m using EMGU for C# image analysis. However, I know the homography matrix isn’t unique to EMGU, and so perhaps someone with knowledge of another language can explain better.
Could someone please explain, as simply as possible, what each element does? I’ve looked this up online but can’t find an answer that I can properly understand (as I said, I’m kinda new to all this!).
I analyse 2 images, both 2 dimensional. Therefore a 3×3 matrix is needed to account for the rotation / translation of the image. If no movement is detected, the homography matrix is:
1 0 0,
0 1 0,
0 0 1
I know from research (eg OpenCV Homography, Transform a point, what is this code doing?) that:
1 0 Tx,
0 1 Ty,
X X X
The 1 0, 0 1 bit is the rotation of the x and y coordinates. The Tx and Ty bits are the translational movement, but what is the XXX bit? This is what I don’t understand. Is it something to do with affine transformations? Please can someone explain:
1. Whether I’m correct in what I say above.
2. What the XXX bit means.
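It’s not that difficult to understand if you have a grasp of matrix multiplication. Assume your point x is
x = <a; b>T (T means transposed, i.e. a column vector)
and you want to rotate the coordinate system by A:
A =
a11 a12,
a21 a22
and “move” it by t:
t = <Tx; Ty>T
The latter two are the components of the affine transformation to get the new point y:
y = A*x + t = <a'; b'>T
As you know, to get that, one can construct a 3×3 matrix B and a vector x' looking like
B =
a11 a12 Tx,
a21 a22 Ty,
0 0 1
x' = <a; b; 1>T
such that
y' = B*x' = <a'; b'; 1>T
from which you can extract y. Let’s see how that works. In the original transformation (using addition), the first step would be to carry out the multiplication, i.e. the rotating part y_r:
y_r = A*x = <a11*a + a12*b; a21*a + a22*b>T
then you add the “absolute” part:
y = y_r + t = <a11*a + a12*b + Tx; a21*a + a22*b + Ty>T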
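As a rough sketch of those two steps (multiply by A, then add t) in plain Python: the numbers chosen for A, t and x are made up for illustration and are not taken from the answer, and `mat_vec` is just a hypothetical helper name.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Illustrative values only (assumptions, not from the original post).
A = [[0, -1],
     [1,  0]]   # rotation by 90 degrees counter-clockwise
t = [2, 3]      # translation (Tx, Ty)
x = [3, 1]      # the point (a, b)

y_r = mat_vec(A, x)                   # the rotating part: A*x
y = [r + s for r, s in zip(y_r, t)]   # then add the translation: A*x + t

print(y_r)  # [-1, 3]
print(y)    # [1, 6]
```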
Now look at how B works. I’ll calculate y' = B*x' row by row, writing a11 … a22 for the entries of A and Tx, Ty for the entries of t:
1) a' = a11*a + a12*b + Tx*1
2) b' = a21*a + a22*b + Ty*1
3) the rest: 0*a + 0*b + 1*1 = 1
Just what we expected. First, the rotation part gets calculated (addition and multiplication). Then the x-part of the translational part gets added, multiplied by 1, so it stays the same. The same thing happens for the second row.
In the third row, a and b are dropped (multiplied by 0). The last part is kept the same, and happens to be 1. So all that last line does is “drop” the values of the point and keep the 1.
It could be argued, then, that a 2×3 matrix would be enough for that. That’s partially true, but has one significant disadvantage: you lose composability. Suppose you are basically satisfied with B, but want to mirror one coordinate. Then you can choose another transformation matrix, for example one that flips the first coordinate,
M =
-1 0 0,
0 1 0,
0 0 1
and have a result
y'' = M*B*x' = (M*B)*x'
This simple multiplication could not be done that easily with 2×3 matrices, simply because of the properties of matrix multiplication.
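To make the composition argument concrete, here is a minimal sketch in plain Python. The entries of B and M, the point, and the helper names `mat_vec` and `mat_mul` are all illustrative assumptions, not values from the answer. It checks that applying the mirror after B gives the same result as applying the single pre-multiplied matrix M*B.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def mat_mul(p, q):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*q))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in p]

# Illustrative values only (assumptions, not from the original post).
B = [[0, -1, 2],    # 90 degree rotation plus translation (2, 3)
     [1,  0, 3],
     [0,  0, 1]]
M = [[-1, 0, 0],    # mirror the first coordinate
     [ 0, 1, 0],
     [ 0, 0, 1]]
x_h = [3, 1, 1]     # the point (3, 1) with a trailing 1 appended

step_by_step = mat_vec(M, mat_vec(B, x_h))   # first B, then M
C = mat_mul(M, B)                            # one combined 3x3 matrix
combined = mat_vec(C, x_h)

print(C)             # [[0, 1, -2], [1, 0, 3], [0, 0, 1]]
print(step_by_step)  # [-1, 6, 1]
print(combined)      # [-1, 6, 1]
```

Because both matrices are square 3×3, the product M*B is again a 3×3 matrix of the same kind, which is what a 2×3 representation cannot offer directly.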
In principle, in the above, the last row (the XXX) could also be anything else of the form <0;0;x>. It was there just to drop the point values. It is however necessary exactly like this to make composition by multiplication work.
Finally, Wikipedia seems quite informative to me in this case.
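One way to see why the trailing 1 matters for chaining, sketched in plain Python under made-up values (the matrices `B_good`, `B_odd`, `T` and the helper `mat_vec` are hypothetical, chosen only for illustration): if the last row is <0;0;2> instead of <0;0;1>, the third component of the result becomes 2, and the translation of the next matrix in the chain is applied twice over.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Illustrative values only (assumptions, not from the original post).
B_good = [[0, -1, 2],
          [1,  0, 3],
          [0,  0, 1]]   # last row <0;0;1>
B_odd = [[0, -1, 2],
         [1,  0, 3],
         [0,  0, 2]]    # last row <0;0;2>
T = [[1, 0, 5],         # a second step: shift the first coordinate by 5
     [0, 1, 0],
     [0, 0, 1]]
x_h = [3, 1, 1]

good = mat_vec(T, mat_vec(B_good, x_h))
odd = mat_vec(T, mat_vec(B_odd, x_h))

print(good)  # [6, 6, 1]   first coordinate shifted by 5, as intended
print(odd)   # [11, 6, 2]  first coordinate shifted by 10, trailing value is 2
```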