I have a project where I need to subtract an empty template image from an incoming, user-filled image. The document type is a normal bank cheque.
The aim is to extract the handwritten fields by subtracting the empty template image from the filled-in image.
The issue I am facing is aligning the two images, as they differ in scale, translation, rotation, etc.
Any ideas on how to align the template image with the incoming image?
UPDATE 1:
I am posting an example image from the Wikipedia page, converted to monochrome because my images are monochrome.

The basic answer is to write a function that takes two images and a 2D transform and tells you how well aligned they are once you apply the transform to the target image. The function needs to be continuous in the transform parameters and have a minimum (0) where the images are aligned perfectly. This is called a cost function.
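To make the idea of a cost function concrete, here is a minimal pure-Python sketch. It assumes images are lists of rows of floats, that the transform is parameterised as `(tx, ty, angle, scale)` about the image centre, and that "how aligned" means mean squared pixel difference. The names `sample_bilinear` and `alignment_cost` are made up for illustration, and none of these choices is the only reasonable one.

```python
import math

def sample_bilinear(img, x, y):
    """Sample img at real-valued (x, y); pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0

    def px(ix, iy):
        return img[iy][ix] if 0 <= ix < w and 0 <= iy < h else 0.0

    top = px(x0, y0) * (1 - fx) + px(x0 + 1, y0) * fx
    bottom = px(x0, y0 + 1) * (1 - fx) + px(x0 + 1, y0 + 1) * fx
    return top * (1 - fy) + bottom * fy

def alignment_cost(template, target, tx, ty, angle, scale):
    """Mean squared difference between template and the transformed target.

    Each template pixel (x, y) is rotated and scaled about the image centre,
    then translated by (tx, ty), to find where to sample the target.
    Bilinear sampling keeps the cost continuous in the parameters.
    """
    h, w = len(template), len(template[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    c, s = math.cos(angle) * scale, math.sin(angle) * scale
    total = 0.0
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            u = c * dx - s * dy + cx + tx
            v = s * dx + c * dy + cy + ty
            diff = template[y][x] - sample_bilinear(target, u, v)
            total += diff * diff
    return total / (h * w)

# Toy data: an 8x8 template with a 3x3 blob, and a target shifted right by 1.
template = [[0.0] * 8 for _ in range(8)]
for y in range(2, 5):
    for x in range(2, 5):
        template[y][x] = 1.0
target = [[0.0] + row[:-1] for row in template]

misaligned = alignment_cost(template, target, 0, 0, 0.0, 1.0)
aligned = alignment_cost(template, target, 1, 0, 0.0, 1.0)
```

In this toy case `aligned` comes out lower than `misaligned`, which is the property the optimizer relies on. On real scans you would probably want a library routine for the warp, and possibly a blurred or downsampled copy of the images so the cost surface is smoother.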
Then run any optimization algorithm over that function, where the inputs you are optimizing are the transform parameters (translation, scale, rotation). Examples are hill climbing, genetic algorithms, simulated annealing, etc.
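As a rough illustration of the optimization step, the sketch below does a simple coordinate-wise hill climb over a parameter vector `[tx, ty, angle, scale]`. The function name `hill_climb`, the step sizes, and the stand-in `toy_cost` (a quadratic with a known minimum, used here instead of a real image comparison) are all assumptions made for the example.

```python
def hill_climb(cost, start, steps, min_step=1e-4, max_iters=10000):
    """Coordinate-wise hill climbing.

    Tries +step and -step on each parameter, keeps any change that lowers
    the cost, and halves all step sizes when no parameter can be improved.
    Stops once the largest step falls below min_step.
    """
    params = list(start)
    steps = list(steps)
    best = cost(params)
    for _ in range(max_iters):
        improved = False
        for i in range(len(params)):
            for direction in (+1, -1):
                candidate = list(params)
                candidate[i] += direction * steps[i]
                c = cost(candidate)
                if c < best:
                    params, best, improved = candidate, c, True
                    break
        if not improved:
            steps = [s / 2 for s in steps]
            if max(steps) < min_step:
                break
    return params, best

# Stand-in cost with a known minimum at tx=3, ty=-2, angle=0.1, scale=1.05.
# In practice this would wrap the image-comparison cost function.
def toy_cost(p):
    tx, ty, angle, scale = p
    return ((tx - 3) ** 2 + (ty + 2) ** 2
            + 100 * (angle - 0.1) ** 2 + 100 * (scale - 1.05) ** 2)

# Start from the identity transform with per-parameter step sizes.
found, final_cost = hill_climb(toy_cost, [0.0, 0.0, 0.0, 1.0], [1.0, 1.0, 0.1, 0.1])
```

Hill climbing only finds a local minimum, so it helps to start from a reasonable guess (for example the identity transform, or a coarse alignment done on downsampled images). Simulated annealing or a genetic algorithm trade more cost evaluations for a better chance of escaping a poor local minimum.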
There are products that do this. They are usually called Forms Recognition, Forms Registration, Forms Processing, etc. Some are SDKs, but there are also applications that can do it without programming.
Disclaimer: I work at Atalasoft, where we sell a Forms Processing add-on to our .NET imaging SDK.