Conference: IAPR International Conference on Document Analysis and Recognition
Segmentation-Free Printed Traditional Mongolian OCR Using Sequence to Sequence with Attention Model

Abstract

Mongolian optical character recognition (OCR) systems are needed to digitize printed documents and to make use of Mongolian cultural resources. Existing Mongolian OCR systems are segmentation-based, but segmenting Mongolian text is more difficult than segmenting text in other languages, which makes these methods costly and error-prone. In this study, a segmentation-free recognition method for traditional Mongolian words is proposed. Specifically, we formalize the OCR task as a sequence-to-sequence mapping problem, in which the input Mongolian word image and the output text string are treated as a sequence of image frames and a sequence of letters, respectively. A sequence-to-sequence model with attention is adopted to solve this problem. Experimental results on a dataset show the effectiveness of the proposed method.
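
The framing in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It only shows two ideas: cutting a word image into a sequence of frames, and one attention step that weights those frames for a single output letter. The frame height, the step size, slicing along the row axis, dot-product scoring, and the use of raw frame pixels in place of learned encoder states are all assumptions made for illustration; the abstract specifies none of them. The names `split_into_frames` and `attend` are hypothetical.

```python
import math


def split_into_frames(image, frame_height, step):
    """Cut a word image (a list of pixel rows) into frames along the row axis.

    Assumption: frames are groups of consecutive rows; each frame is
    flattened into one feature vector.
    """
    frames = []
    for top in range(0, len(image) - frame_height + 1, step):
        rows = image[top:top + frame_height]
        frames.append([pixel for row in rows for pixel in row])
    return frames


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """One generic dot-product attention step.

    Returns the attention weights over the encoder states and the
    context vector (their weighted sum).
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, state))
        for state in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy 6x2 binary "word image"; a real input would be a scanned word.
image = [[0, 1], [1, 1], [1, 0], [0, 0], [1, 1], [0, 1]]
frames = split_into_frames(image, frame_height=2, step=2)

# Hypothetical decoder state; raw frames stand in for encoder states here.
weights, context = attend([1.0, 0.0, 0.0, 1.0], frames)
```

In a full sequence-to-sequence model, an encoder would first turn the frames into hidden states, and a decoder would repeat the attention step once per output letter, feeding each context vector into the prediction of the next letter. Because the decoder attends over whole frames, no explicit character segmentation is required.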
