We propose a new high-speed, high-accuracy binarization method for recognizing text in document images. First, character neighborhoods are extracted from the input image using a global threshold that is shifted toward the background pixel value, relative to the threshold used in conventional global binarization. Second, characters are extracted using an original local binarization process integrated with image interpolation. Our method takes only 1/100 of the processing time of a method that performs image interpolation first. As a result, it binarizes an A4-size text image (150 dpi) in an average of only 3.3 seconds on a 166 MHz Pentium processor. Furthermore, our method reduces the number of unrecognized characters by 46.5% compared with conventional global binarization.
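
The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the global threshold rule, the amount of shift, the local threshold, or the interpolation scheme, so everything below is an assumption made for illustration. In particular, the sketch assumes dark text on a light background, a midpoint global threshold, a median estimate of the background, 8x8 tiles, a midpoint local threshold, and bilinear interpolation. The names `global_threshold`, `neighborhood_mask`, `bilinear_upsample`, `binarize`, and the parameters `shift`, `block`, `scale`, and `min_contrast` are all hypothetical.

```python
import numpy as np


def global_threshold(gray):
    """Stand-in for a conventional global threshold (assumption):
    the midpoint between the darkest and brightest pixel."""
    return (float(gray.min()) + float(gray.max())) / 2.0


def neighborhood_mask(gray, t, shift=0.5):
    """Stage 1: mark character neighborhoods with a threshold moved
    from t toward the background value (assumed lighter than text)."""
    background = float(np.median(gray))  # assumes background is the majority
    t_shifted = t + shift * (background - t)
    return gray < t_shifted


def bilinear_upsample(tile, scale):
    """Separable bilinear interpolation of one tile, clamped at its edges."""
    h, w = tile.shape
    ys = (np.arange(h * scale) + 0.5) / scale - 0.5
    xs = (np.arange(w * scale) + 0.5) / scale - 0.5
    # Interpolate along x for every row, then along y for every column.
    rows = np.stack([np.interp(xs, np.arange(w), r) for r in tile])
    return np.stack([np.interp(ys, np.arange(h), c) for c in rows.T], axis=1)


def binarize(gray, scale=2, block=8, shift=0.5, min_contrast=30.0):
    """Stage 2: interpolate and locally threshold only the tiles that
    stage 1 flagged; pure-background tiles are skipped entirely."""
    gray = np.asarray(gray, dtype=np.float64)
    t = global_threshold(gray)
    mask = neighborhood_mask(gray, t, shift)
    h, w = gray.shape
    out = np.zeros((h * scale, w * scale), dtype=bool)  # True = text pixel
    for y in range(0, h, block):
        for x in range(0, w, block):
            if not mask[y:y + block, x:x + block].any():
                continue  # no character nearby: no interpolation needed
            tile = gray[y:y + block, x:x + block]
            lo, hi = tile.min(), tile.max()
            # Local midpoint threshold; fall back to the global one when
            # the tile is nearly uniform (for example, inside a thick stroke).
            local_t = (lo + hi) / 2.0 if hi - lo >= min_contrast else t
            up = bilinear_upsample(tile, scale)
            out[y * scale:(y + block) * scale,
                x * scale:(x + block) * scale] = up < local_t
    return out
```

Calling `binarize(image)` on a 2-D grayscale array returns a boolean array at `scale` times the input resolution. Restricting interpolation to the flagged tiles is the design choice that stands in for the integration of local binarization with interpolation; this sketch makes no claim about reproducing the reported speed or accuracy figures, and interpolating each tile in isolation ignores pixels just outside the tile border.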