Conference: International Conference on Document Analysis and Recognition

Multi-font printed Chinese character recognition using multi-pooling convolutional neural network


Abstract

Although previous studies have achieved effective printed Chinese character recognition (PCCR) for a single font or a few different fonts, large-scale multi-font PCCR remains a major challenge owing to the wide variety in the shape, layout, and grey-level distribution of individual Chinese characters across font styles. This paper applies multi-pooling and data augmentation with non-linear transformation to a convolutional neural network (CNN) for multi-font PCCR. We propose a multi-pooling layer on top of the final convolutional layer; this approach is found to be robust to spatial layout variations and deformations in multi-font printed Chinese characters. Experimental results show that multi-pooling significantly improves CNN performance. In addition, we adopt a distorted-sample generation technique that applies non-linear warping functions to an original font image, which distorts the local density of the character strokes in the image. We find that CNN performance is further boosted by the distorted samples. Each input character image is transformed into four distorted images, and the CNN learns from the original image as well as the distorted samples to classify 3755 classes (the level-1 set of GB2312-80) of printed Chinese characters in two settings: 280 widely varying fonts and 120 manually selected fonts. Recognition rates of 94.38% and 99.74% are achieved in the former and latter settings, respectively, which indicates the effectiveness of the proposed methods.
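The abstract does not give the warping functions or the exact form of the multi-pooling layer, so the following is only a minimal NumPy sketch under stated assumptions, not the authors' implementation. It assumes an exponential warping function (`warp_coords`, with hypothetical strength parameters `a_x` and `a_y`) and reads "multi-pooling" as max pooling one feature map with several window sizes and concatenating the results; the helper names, parameter values, and window sizes are all illustrative.

```python
import numpy as np

def warp_coords(t, a):
    """Monotonic non-linear map of [0, 1] onto itself (assumed form, not from the paper)."""
    if abs(a) < 1e-8:
        return t  # a == 0 means no distortion
    return (1.0 - np.exp(-a * t)) / (1.0 - np.exp(-a))

def distort(image, a_x, a_y):
    """Resample a 2-D character image along warped row/column coordinates.

    This stretches some regions and compresses others, which changes the
    local stroke density. Nearest-neighbour lookup keeps the sketch short.
    """
    h, w = image.shape
    rows = np.rint(warp_coords(np.linspace(0.0, 1.0, h), a_y) * (h - 1)).astype(int)
    cols = np.rint(warp_coords(np.linspace(0.0, 1.0, w), a_x) * (w - 1)).astype(int)
    return image[np.ix_(rows, cols)]

# Four hypothetical parameter pairs: one distorted copy per pair.
WARP_PARAMS = [(2.0, 0.0), (-2.0, 0.0), (0.0, 2.0), (0.0, -2.0)]

def make_samples(image, params=WARP_PARAMS):
    """Return the original image followed by its distorted copies."""
    return [image] + [distort(image, ax, ay) for ax, ay in params]

def max_pool(fmap, k):
    """Non-overlapping k x k max pooling on a (C, H, W) array; H and W must be divisible by k."""
    c, h, w = fmap.shape
    return fmap.reshape(c, h // k, k, w // k, k).max(axis=(2, 4))

def multi_pool(fmap, sizes=(1, 2, 4)):
    """Pool the same feature map at several window sizes and concatenate (assumed reading)."""
    return np.concatenate([max_pool(fmap, k).ravel() for k in sizes])

if __name__ == "__main__":
    image = np.arange(64, dtype=float).reshape(8, 8)
    samples = make_samples(image)
    print(len(samples), samples[1].shape)

    feature_map = np.arange(32, dtype=float).reshape(2, 4, 4)
    print(multi_pool(feature_map).shape)
```

In this sketch the concatenated vector would feed the fully connected layers, so the classifier sees both fine and coarse spatial summaries of the final convolutional output; whether the paper combines its pooling outputs this way is not stated in the abstract.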
