I’m trying to calculate the (cross-?)correlation between a template image (a small image that is part of a bigger image) and the larger image the template belongs to.
Suppose the template image is 3×2 and the large image is 20×20. First I converted both images to grayscale. Then I computed the mean of the gray values (again, for both). After that I checked pixel by pixel whether the current pixel is lower or higher than the mean. If it’s lower, I colored the pixel black; if it’s higher, the pixel becomes white. So basically this leaves me with a binary image, where 1 == white and 0 == black.
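The thresholding step described above could be sketched like this in Python. It is only an illustration: the function name binarize_by_mean and the sample gray values are made up, and a pixel exactly equal to the mean is treated as black here, a case the description leaves open.

```python
def binarize_by_mean(gray):
    """Threshold a grayscale image (a list of rows) at its own mean.

    Returns a same-shaped image where 1 = white and 0 = black.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    # Assumption: pixels strictly above the mean become white (1),
    # everything else (including ties) becomes black (0).
    return [[1 if p > mean else 0 for p in row] for row in gray]


# Hypothetical 3x2 grayscale template; the values are invented for this example.
template_gray = [[200, 40, 220],
                 [30, 210, 50]]
template_bits = binarize_by_mean(template_gray)
print(template_bits)
```

Read row by row, the result for these sample values is the bit string 101010.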
My template image binary value is: 101010
Then in the large image I start scanning each pixel to see if it matches the template. So I start at x=0, y=0 in the big image and compare the first three pixels on the X axis from the first two rows on the Y axis with those of the template image. The binary value for that is: 111010
So the next step is to check the correlation, right? Now here’s the tricky part for me, because I’m not sure if I’m doing it right. But this is what I’ve come up with:
101010 (template image)
Sum = 3
Mean = 0.5
Standard Deviation = 4.2
111010 (big image, first section)
Sum = 4
Mean = 0.66
Standard Deviation = 2.82
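As a cross-check of these figures, here is a small sketch using Python's standard statistics module. The variable names are only illustrative. One thing worth knowing when comparing: for data made of 0s and 1s the population standard deviation can never exceed 0.5, so the standard deviations listed above are worth recomputing.

```python
import statistics

template = [1, 0, 1, 0, 1, 0]   # 101010
window = [1, 1, 1, 0, 1, 0]     # 111010

stats = {}
for name, bits in (("template", template), ("window", window)):
    stats[name] = {
        "sum": sum(bits),
        "mean": statistics.mean(bits),
        "pstdev": statistics.pstdev(bits),  # population standard deviation
        "stdev": statistics.stdev(bits),    # sample standard deviation (n - 1)
    }
    print(name, stats[name])
```

The sums (3 and 4) and the means (0.5 and about 0.67) agree with the post; the standard deviations come out at roughly 0.5 and 0.47 (population) or 0.55 and 0.52 (sample).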
Then I tried to calculate the correlation like so:

Which got me the following result:
r = -0.04
Since this number isn’t close to 1 at all, this means there is no close correlation, right?
Or maybe I have to compare it to the critical value, n-2? In this case that is 6-2 = 4. Since r isn’t close to 4 either, this also means that there is no correlation, right?
And what does it mean when it’s close to -1? Does this mean that there is even less correlation?
And most important: are my calculations correct? Or am I still missing something?
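For anyone wanting to check the numbers, here is a minimal sketch that computes a correlation coefficient for the two bit strings. It assumes the formula in question is the Pearson correlation coefficient, which is a guess since the formula itself is not reproduced in the post; pearson_r is a name invented for this example.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # Undefined (division by zero) if either sequence is constant.
    return cov / math.sqrt(var_a * var_b)


template = [1, 0, 1, 0, 1, 0]   # 101010
window = [1, 1, 1, 0, 1, 0]     # 111010
inverted = [0, 1, 0, 1, 0, 1]   # the template with black and white swapped

r = pearson_r(template, window)
r_inverted = pearson_r(template, inverted)
print(round(r, 3), round(r_inverted, 3))
```

Under that assumption the two patches give r of about 0.71, not -0.04. Pearson's r always lies between -1 and 1, so it is never compared against a value like 4. A value near -1 does not mean "less correlation"; it means a strong inverse relationship, as with the inverted template, which gives -1.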
I’ve had another look at your question. I think you’re confusing cross-correlation with convolution.
Basically, you use cross-correlation when you have two images of the same size, and you want to tell how similar they are. For example, you could use it on two consecutive frames in a video. Used this way, cross-correlation yields a single number.
You’d use convolution when you have a template image of a small size and you want to find its location in an image of a larger size (for this reason, it’s also commonly known as template matching). Convolution yields a new image. Each pixel (x, y) in this new image represents the strength of the match of the template to the original image at location (x, y). Typically, after performing convolution, you look for the maximum value (or values) in the result, and use those as the detected locations of the template. (Strictly speaking, sliding the template over the image without flipping it is cross-correlation evaluated at every offset, while convolution flips the template first; the two coincide for symmetric templates, and the terms are often used loosely.)
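To make the "new image of match strengths" idea concrete, here is a rough pure-Python sketch for binary images. The function names match_map and best_match are made up, and the score used here (the fraction of pixels that agree with the template) is a deliberately simple stand-in chosen for 0/1 images, not the normalized cross-correlation a library would typically compute.

```python
def match_map(image, template):
    """Slide the template over the image and score every valid position.

    Both arguments are lists of rows containing 0/1 values.
    scores[y][x] is the fraction of template pixels that equal the
    image pixels when the template's top-left corner sits at (x, y).
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    scores = []
    for y in range(ih - th + 1):
        row = []
        for x in range(iw - tw + 1):
            same = sum(
                image[y + dy][x + dx] == template[dy][dx]
                for dy in range(th)
                for dx in range(tw)
            )
            row.append(same / (th * tw))
        scores.append(row)
    return scores


def best_match(scores):
    """Return (score, x, y) for the highest-scoring position."""
    return max(
        ((s, x, y) for y, row in enumerate(scores) for x, s in enumerate(row)),
        key=lambda t: t[0],
    )


template = [[1, 0, 1],
            [0, 1, 0]]
# Small made-up image with the template embedded at x=2, y=1.
image = [[0, 0, 0, 0, 0],
         [0, 0, 1, 0, 1],
         [0, 0, 0, 1, 0],
         [0, 0, 0, 0, 0]]

scores = match_map(image, template)
best = best_match(scores)
print(best)
```

The score map has one entry per position where the template fits entirely inside the image, so a 3×2 template on a 5×4 image gives a 3×3 map. The maximum lands where the template was embedded.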
Given the way you’ve worded your question, it sounds like convolution is indeed what you are after. In your case, the images are of different sizes, so you can’t really calculate a single cross-correlation value between them meaningfully.
Finally, after you’ve figured out how it all works, you can take comfort in the thought that most image processing libraries include this functionality. Everyone who’s done any decent image processing has implemented convolution at least once just for the fun of it, though 🙂