Conference: International Congress on Image and Signal Processing, BioMedical Engineering and Informatics

Segmentation-Free Multi-Font Printed Manchu Word Recognition Using Deep Convolutional Features and Data Augmentation


Abstract

Segmentation-based Manchu recognition methods depend on precise character segmentation, which is difficult to achieve because of the complex spelling rules of the Manchu language and the existence of multiple Manchu fonts. To avoid the influence of incorrect segmentation, this work proposes segmentation-free recognition, which recognizes whole Manchu words instead of individual Manchu characters. In addition, an end-to-end 9-layer convolutional neural network is proposed to automatically extract deep hierarchical features from Manchu word images. The proposed recognition model is applied to Manchu words in 12 Manchu fonts to evaluate its multi-font recognition ability. Deep neural networks need massive amounts of training data, whereas Manchu is an endangered language with few available documents. To resolve this conflict, this work first builds a Manchu dataset prototype and a multi-font Manchu word test set, and then designs a data augmentation system to generate synthetic training data. The data augmentation system contains 7 generation methods, including character structure distortion and image quality transformation. Experiments demonstrate that the proposed convolutional neural network achieves a new state-of-the-art accuracy on multi-font printed Manchu words. Across the printed Manchu fonts, the highest recognition accuracy is 0.95, the lowest is 0.88, and the average is 0.91. Experiments also demonstrate that the proposed data augmentation system is an effective way to address the problem of insufficient data.
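
The abstract names two families of generation methods, character structure distortion and image quality transformation, but does not list the seven methods themselves. The following is a minimal sketch of what one method from each family might look like, not the authors' implementation. The function names (`shear_rows`, `add_noise`, `augment`), the choice of a row shear and Gaussian noise, and all parameter ranges are assumptions made for illustration. Images are plain nested lists of grayscale values so the example runs with the Python standard library alone.

```python
import random

# Assumed representation: rows of grayscale pixels, 0 = ink, 255 = background.
Image = list[list[int]]


def shear_rows(img: Image, max_shift: int, background: int = 255) -> Image:
    """Shift each row sideways by an amount that grows linearly down the image.

    Hypothetical stand-in for a 'character structure distortion' method.
    """
    height, width = len(img), len(img[0])
    out = []
    for y, row in enumerate(img):
        # Row 0 is not shifted; the last row is shifted by max_shift pixels.
        shift = round(max_shift * y / max(height - 1, 1))
        shift = max(-width, min(width, shift))  # keep the slice bounds valid
        if shift >= 0:
            new_row = [background] * shift + row[: width - shift]
        else:
            new_row = row[-shift:] + [background] * (-shift)
        out.append(new_row)
    return out


def add_noise(img: Image, sigma: float, rng: random.Random) -> Image:
    """Add Gaussian pixel noise and clip to [0, 255].

    Hypothetical stand-in for an 'image quality transformation' method.
    """
    return [
        [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
        for row in img
    ]


def augment(img: Image, n: int, seed: int = 0) -> list[Image]:
    """Generate n synthetic variants of one word image (assumed parameter ranges)."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        sheared = shear_rows(img, max_shift=rng.randint(-2, 2))
        samples.append(add_noise(sheared, sigma=rng.uniform(0.0, 20.0), rng=rng))
    return samples


if __name__ == "__main__":
    # A toy 6x8 "word image": one vertical stroke in column 3.
    word = [[255] * 8 for _ in range(6)]
    for r in range(6):
        word[r][3] = 0
    variants = augment(word, n=5, seed=1)
    print(len(variants), "synthetic samples of size",
          len(variants[0]), "x", len(variants[0][0]))
```

Each variant keeps the input dimensions, so synthetic samples of this kind could be fed to a fixed-input network such as the 9-layer CNN described above without further resizing.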
