Conference on Visual Communications and Image Processing

Mathematical morphology-based shape feature analysis for Chinese character recognition systems

Abstract

This paper proposes an efficient shape feature extraction technique based on mathematical morphology. A new shape complexity index is also proposed for preclassification in machine-printed Chinese character recognition (CCR). For characters rendered in different fonts or sizes, or captured at low resolution, a more stable local feature such as shape structure is preferable for recognition. Morphological valley extraction filters are applied to extract the protrusive strokes from the four sides of an input Chinese character. The number of local strokes extracted reflects the shape complexity of each side, and these shape features are encoded as the corresponding shape complexity index. Based on this index, the database can be partitioned into 16 groups before the recognition procedure begins. Combining shape feature analysis with an existing recognition system recovers several previously misrecognized characters and improves the recognition rate by 3.3% on average. Besides enhancing recognition performance, the extracted strokes can be further analyzed and classified by stroke type. The combination of strokes extracted from each side therefore provides a means of clustering the database by radical or subword components. It is one of the best solutions for recognizing high-complexity character sets such as Chinese, which comprises more than 13,000 characters divided into more than 200 categories.
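
The abstract does not specify the valley extraction filter or how the index is encoded, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that each side of the character is summarized by a 1-D distance profile (distance from that side to the first ink pixel), that valleys are extracted with a bottom-hat filter (morphological closing minus the original profile), and that each side contributes one bit to the index, which yields 2^4 = 16 groups. The names `count_valleys`, `side_profiles`, `complexity_index`, and the parameters `k`, `depth`, and `complex_at` are hypothetical.

```python
# Sketch only: filter, thresholds, and bit encoding are assumptions,
# not details taken from the paper.

def dilate(f, k):
    """1-D grayscale dilation with a flat window of odd width k."""
    r, n = k // 2, len(f)
    return [max(f[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

def erode(f, k):
    """1-D grayscale erosion with a flat window of odd width k."""
    r, n = k // 2, len(f)
    return [min(f[max(0, i - r):min(n, i + r + 1)]) for i in range(n)]

def count_valleys(profile, k, depth):
    """Count valleys narrower than k and at least `depth` deep.

    Bottom-hat = closing(profile) - profile; each connected run of
    samples with bottom-hat >= depth is counted as one protrusive stroke.
    """
    closed = erode(dilate(profile, k), k)
    count, inside = 0, False
    for c, v in zip(closed, profile):
        hit = (c - v) >= depth
        if hit and not inside:
            count += 1
        inside = hit
    return count

def side_profiles(glyph, ink="#"):
    """Distance from each side of the grid to the first ink pixel."""
    h, w = len(glyph), len(glyph[0])

    def first(seq):
        # Index of the first ink pixel, or len(seq) if the line is empty.
        for i, c in enumerate(seq):
            if c == ink:
                return i
        return len(seq)

    rows = [list(r) for r in glyph]
    cols = [[glyph[y][x] for y in range(h)] for x in range(w)]
    return {
        "top": [first(c) for c in cols],
        "bottom": [first(c[::-1]) for c in cols],
        "left": [first(r) for r in rows],
        "right": [first(r[::-1]) for r in rows],
    }

SIDES = ("top", "right", "bottom", "left")  # assumed bit order

def complexity_index(glyph, k=3, depth=2, complex_at=2):
    """Pack one 'complex side' bit per side into an index in 0..15."""
    profiles = side_profiles(glyph)
    index = 0
    for side in SIDES:
        n = count_valleys(profiles[side], k, depth)
        index = (index << 1) | (1 if n >= complex_at else 0)
    return index

# Toy glyph: three horizontal strokes of different lengths.
GLYPH = [
    ".........",
    ".#######.",
    ".........",
    "...###...",
    ".........",
    ".#######.",
    ".........",
]

if __name__ == "__main__":
    print(complexity_index(GLYPH))
```

In a preclassification step, an index like this would select one of the 16 database groups so that the recognizer only compares the input against templates in that group. The window width `k` determines which strokes count as local: a valley wider than the window is not filled by the closing and is therefore not counted.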