International Conference on Document Analysis and Recognition

A Scalable Handwritten Text Recognition System


Abstract

Many studies on (offline) Handwritten Text Recognition (HTR) systems have focused on building state-of-the-art models for line recognition on small corpora. However, adding HTR capability to a large-scale multilingual OCR system poses new challenges. This paper addresses three problems in building such systems: data, efficiency, and integration. First, one of the biggest challenges is obtaining sufficient amounts of high-quality training data. We address the problem by using online handwriting data collected for a large-scale production online handwriting recognition system. We describe our image data generation pipeline and study how online data can be used to build HTR models. We show that the data improve the models significantly when only a small number of real images is available, which is usually the case for HTR models. This enables us to support a new script at substantially lower cost. Second, we propose a line recognition model based on neural networks without recurrent connections. The model achieves accuracy comparable to that of LSTM-based models while allowing for better parallelism in training and inference. Finally, we present a simple way to integrate HTR models into an OCR system. Together, these constitute a solution for bringing HTR capability into a large-scale OCR system.
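
The abstract mentions generating image data from online handwriting data but gives no details of the pipeline. As a rough illustration of the general idea only, the sketch below rasterizes pen strokes into a binary image. The stroke format (lists of normalized (x, y) points) and the function name `render_strokes` are assumptions made for this example, not the paper's method.

```python
# Hypothetical sketch: turn online handwriting (pen strokes) into an image.
# Stroke format and names are assumptions; the paper's pipeline is not shown here.

def render_strokes(strokes, width, height):
    """Rasterize strokes, each a list of (x, y) points in [0, 1], onto a grid."""
    image = [[0] * width for _ in range(height)]
    for stroke in strokes:
        # Map normalized pen coordinates to integer pixel positions.
        pixels = [(round(x * (width - 1)), round(y * (height - 1)))
                  for x, y in stroke]
        if len(pixels) == 1:
            # A single pen sample becomes a single dot.
            px, py = pixels[0]
            image[py][px] = 1
        for (x0, y0), (x1, y1) in zip(pixels, pixels[1:]):
            # Linearly interpolate between consecutive pen samples.
            steps = max(abs(x1 - x0), abs(y1 - y0), 1)
            for i in range(steps + 1):
                px = round(x0 + (x1 - x0) * i / steps)
                py = round(y0 + (y1 - y0) * i / steps)
                image[py][px] = 1
    return image


if __name__ == "__main__":
    # One horizontal stroke across the top of a 5x3 image.
    for row in render_strokes([[(0.0, 0.0), (1.0, 0.0)]], 5, 3):
        print("".join("#" if v else "." for v in row))
```

A real pipeline would presumably also vary stroke width, add backgrounds and noise, and apply distortions so that rendered lines resemble scanned or photographed handwriting; none of that is modeled in this sketch.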
