Journal: Image and Vision Computing

Chinese text distinction and font identification by recognizing most frequently used characters



Abstract

This study proposes a method for implementing three functions that can assist a traditional OCCR (Optical Chinese Character Recognition) system: (1) identifying the font used in a document; (2) detecting and recognizing the most frequently used (MFU) characters; and (3) distinguishing between machine-printed and hand-written characters. According to a study by Chang and Chen (Proceedings of the ICCC, 1994, pp. 310-316), the top 40 MFU characters account for about 20% of the Chinese characters in a text document. If those MFU characters can be detected before the traditional OCCR method is applied, computation time can be reduced substantially. The proposed character detection method consists of three stages: segmentation, feature extraction, and classification. In the first stage, the projection-profile-based method presented by Wang et al. (Pattern Recognition 30 (1997) 1213) is used to segment individual characters from the input text document. In the second stage, three types of features are introduced: the density of black pixels, the projection profile code, and the modified skeleton template. These features are used to check whether a segmented character is semi-matched or fully matched with an MFU template. In the last stage, based on the matching result, three algorithms implementing the aforementioned functions are provided. Experimental results are given to demonstrate the practicality and superiority of the proposed method.
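To make the first two stages more concrete, the following is a minimal Python sketch of projection-profile segmentation and the black-pixel density feature. It is not the method of Wang et al. or of the paper itself: the function names, the binary-image representation (1 for a black pixel, 0 for background), and the rule of splitting at completely blank columns are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed representation: a text line as rows of pixels, 1 = black, 0 = white.
Image = List[List[int]]


def projection_profile(image: Image, axis: str = "vertical") -> List[int]:
    """Count black pixels per column ("vertical") or per row (any other value)."""
    if not image:
        return []
    if axis == "vertical":
        return [sum(column) for column in zip(*image)]
    return [sum(row) for row in image]


def segment_columns(image: Image) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges separated by blank columns."""
    profile = projection_profile(image, "vertical")
    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x  # first inked column of a new segment
        elif count == 0 and start is not None:
            spans.append((start, x))  # a blank column closes the segment
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def black_pixel_density(image: Image, span: Tuple[int, int]) -> float:
    """Fraction of black pixels inside the column range `span`."""
    start, end = span
    area = len(image) * (end - start)
    if area == 0:
        return 0.0
    black = sum(sum(row[start:end]) for row in image)
    return black / area


# Hypothetical 3x6 text line containing two "characters" separated by blank columns.
text_line = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 1],
]
spans = segment_columns(text_line)
densities = [black_pixel_density(text_line, span) for span in spans]
```

Splitting at blank columns is deliberately naive: many Chinese characters contain internal vertical gaps (for example, between left and right components), so a practical segmenter needs additional rules, such as expected character width, to avoid cutting one character in two.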
