An optimization for binarization methods by removing binary artifacts
Raúl Rojas, Marte Ramirez-Ortegon, Volker Märgner, Erik Cuevas – 2013
In this article, we introduce a novel technique to remove binary artifacts. Given a gray-intensity image and its corresponding binary image, our method detects and remove connected components that are more likely to be background pixels. With this aim, our method constructs an auxiliary image by the min-imum-error-rate threshold and, then, computes the ratio of intersection between the connected compo-nents of the original binary image and the connected components of the auxiliary image. Connected components with high ratio are considered true connected components while the rest are removed from the output. We tested our method in binarization methods for historical documents (handwritten and printed). Our results are favorable and indicate that our method can improve the outputs from diverse binarization methods. In particular, a high improvement was observed for printed documents. Our method is easy to implement, has a moderate computational cost, and has two parameters whose model interpretation allows an easy empirical selection.