Does anybody know of post-processing algorithms for removing ghost objects from a binarized image? The problem:
When I binarize an image using, for example, the Niblack or Bernsen method,
it produces a lot of noise. I have read books and internet articles about binarization, and they all say that a post-processing step is needed after Niblack's and other binarization methods,
but they don't say what that post-processing operation is. So please, if someone knows, tell me.
EDIT:
Original image:
alt text http://i.piccy.info/i4/20/63/b970ab2ca66e997f421e969a1657.bmp
Bernsen threshold winsize 31, contrast difference 15:
alt text http://i.piccy.info/i4/32/55/2f1e0293311119986bd49529e579.bmp
Bernsen threshold winsize 31, contrast difference 31:
alt text http://i.piccy.info/i4/2a/13/774508890030b93201458986bbd2.bmp
Niblack method, window size 15, k_value 0.2:
alt text http://i.piccy.info/i4/12/4f/fa6fc09bcba7a7e3245d670cbfa5.bmp
Niblack method, window size 31, k_value 0.2:
alt text http://i.piccy.info/i4/c0/fd/1f190077abba2aeea89398358fc0.bmp
EDIT2:
As you can see, the Niblack threshold produces a lot of noise.
And if I make the window size smaller, the black squares become slightly white inside.
Bernsen is better: it produces less noise, including when I make the contrast difference bigger,
but there is one problem. I can't produce an example image right now, so I will describe it in words:
if the image contains objects whose color is close to white, and the background is white,
then wherever there is also a black region nearby (for example a line), the method ignores those objects and the result is wrong.
That is because the Bernsen method works like this:
at each pixel, calculate the contrast difference within the window
diff = maximum_grayscale_value - minimum_grayscale_value
diff decides whether the window has enough contrast, and the threshold is taken as the midpoint of the same maximum and minimum values.
But in the case I described above, the maximum value is 255
and the minimum value is 0,
so the threshold will be about 128,
but the actual object color is above 128 (close to white).
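To make the failure case concrete, here is a minimal pure-Python sketch of a Bernsen-style threshold. The function name `bernsen`, its parameters, and the rule used for low-contrast windows (compare the midpoint against 128) are assumptions for illustration, not taken from any particular library; the image is assumed to be a list of rows of 0..255 gray values.

```python
def bernsen(image, win=31, contrast_min=15):
    """Bernsen-style local threshold (illustrative sketch).

    image: list of rows, each a list of gray values in 0..255.
    Returns a same-sized image containing only 0 and 255.
    """
    h, w = len(image), len(image[0])
    r = win // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Collect the window around (x, y), clipped at the borders.
            window = [image[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            lo, hi = min(window), max(window)
            mid = (lo + hi) / 2.0
            if hi - lo < contrast_min:
                # Assumption: a flat window is classified by its mid-gray.
                out[y][x] = 255 if mid >= 128 else 0
            else:
                out[y][x] = 255 if image[y][x] >= mid else 0
    return out


# White background (255), a light-gray object (200), and a black line (0).
row = [[255, 200, 0, 255, 255]]
result = bernsen(row, win=5, contrast_min=15)
```

In this toy row the black pixel pulls the local midpoint down to 127.5, so the light-gray object at index 1 ends up white, just like the background. The nested loops are slow on real images; a practical version would use sliding minimum and maximum filters instead.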
So I need some post-processing operations to make the binarization correct.
Any thoughts?
Complete Python program using K-means, a tool meant for finding optimal quantization intervals:
During each iteration, K-means computes the center of each “cluster” then reassigns elements to clusters based upon the recomputed centers. In the simple case where each element (i.e., pixel) is one-dimensional, and only two clusters are required, the threshold is simply the average of the two cluster centers.
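A minimal sketch of the iteration described above, for the one-dimensional, two-cluster case. This is not necessarily the program referred to in this answer; the names `kmeans_threshold` and `binarize` are hypothetical, and the pixels are assumed to be a flat sequence of gray values.

```python
def kmeans_threshold(pixels, max_iter=100):
    """Two-cluster 1-D k-means; returns the midpoint of the two centers."""
    lo, hi = float(min(pixels)), float(max(pixels))
    if lo == hi:
        return lo  # constant image: nothing to separate
    for _ in range(max_iter):
        # A pixel is nearest to the low center exactly when it lies at or
        # below the midpoint of the two centers.
        t = (lo + hi) / 2.0
        dark = [p for p in pixels if p <= t]
        bright = [p for p in pixels if p > t]
        # Recompute each center as the mean of its cluster.
        new_lo = sum(dark) / len(dark)
        new_hi = sum(bright) / len(bright)
        if new_lo == lo and new_hi == hi:
            break  # assignments no longer change
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def binarize(image, threshold):
    """Map every pixel above the threshold to 255 and the rest to 0."""
    return [[255 if p > threshold else 0 for p in row] for row in image]


image = [[10, 12, 200], [14, 210, 220]]
flat = [p for row in image for p in row]
t = kmeans_threshold(flat)
binary = binarize(image, t)
```

Because the centers start at the darkest and brightest pixel values, neither cluster can become empty, so the divisions are safe. Note that this yields a single global threshold, so it does not adapt to uneven lighting the way a local method does.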
I should add that this method works for the example image you posted, but may not work for others (such as the one you posted in another question). Without further information, though, I think this solution works.
Output:
binary.bmp http://up.stevetjoa.com/binary.bmp