To address the limitation that current printed-document identification methods require the same characters to appear in the samples being compared, this paper proposes a printed-document identification method based on character image segmentation. The k-means algorithm is used to segment each character image, and local binary pattern (LBP) texture features are extracted separately from the different regions, which eliminates the influence of character structure on the identification results. The classification performance of single-region feature sets and of combined feature sets is studied. The experimental results show that the proposed method achieves high identification accuracy even when the samples contain no characters in common.
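The pipeline described in the abstract (k-means segmentation of a character image, then LBP texture features per region) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the number of clusters, the clustering feature, or the LBP variant, so the sketch assumes k = 3, clusters on gray-level intensity alone, uses the basic 8-neighbour LBP with a 256-bin histogram, and forms the combined feature set by simple concatenation. All function names are hypothetical.

```python
# Hypothetical sketch: k-means region segmentation + per-region LBP histograms.
# Assumptions (not from the paper): k = 3, clustering on gray level only,
# basic 8-neighbour LBP, combined features = concatenated histograms.

def kmeans_1d(values, k=3, iters=20):
    """Cluster scalar intensities with Lloyd's k-means; return sorted centers."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation: centers spread evenly over the range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        sums, counts = [0.0] * k, [0] * k
        for v in values:
            j = nearest(v, centers)
            sums[j] += v
            counts[j] += 1
        # An empty cluster keeps its previous center.
        new = [sums[i] / counts[i] if counts[i] else centers[i] for i in range(k)]
        if new == centers:
            break
        centers = new
    return sorted(centers)


def nearest(v, centers):
    """Index of the center closest to intensity v."""
    return min(range(len(centers)), key=lambda i: abs(v - centers[i]))


# Offsets of the 8 neighbours, clockwise starting at the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic LBP code: one bit per neighbour that is >= the center pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def region_lbp_features(img, k=3):
    """Return k normalised 256-bin LBP histograms, one per intensity region."""
    h, w = len(img), len(img[0])
    centers = kmeans_1d([p for row in img for p in row], k)
    hists = [[0] * 256 for _ in range(k)]
    # Border pixels are skipped because they lack a full 8-neighbourhood.
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            region = nearest(img[r][c], centers)
            hists[region][lbp_code(img, r, c)] += 1
    feats = []
    for hist in hists:
        total = sum(hist)
        feats.append([x / total if total else 0.0 for x in hist])
    return feats


def combine(feats):
    """Concatenate the per-region histograms into one feature vector."""
    return [x for f in feats for x in f]
```

Each histogram is built only from pixels of one region, so it describes texture within that region rather than the overall shape of the character, which is the property the abstract relies on. A single-region feature set corresponds to one entry of `region_lbp_features(img)`, and a combined set to `combine(...)` applied to the chosen entries; either can then be passed to any standard classifier.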