International Conference on Document Analysis and Recognition

A Comparative Study of Attention-Based Encoder-Decoder Approaches to Natural Scene Text Recognition

Abstract

Attention-based encoder-decoder approaches have shown promising results in scene text recognition. In the literature, models with different encoders, decoders and attention mechanisms have been proposed and compared on isolated word recognition tasks, where the models are trained on either synthetic word images or a small set of real-world images. In this paper, we investigate different components of the attention-based framework and compare its performance with that of a CNN-DBLSTM-CTC based approach on large-scale real-world scene text sentence recognition tasks. We train character models on more than 1.6M real-world text lines and compare their performance on test sets collected from a variety of real-world scenarios. Our results show that (1) attention on a two-dimensional feature map can yield better performance than attention on a one-dimensional one, and an RNN-based decoder performs better than a CNN-based one; (2) attention-based approaches can achieve higher recognition accuracy than CNN-DBLSTM-CTC based approaches on isolated word recognition tasks, but perform worse on sentence recognition tasks; (3) it is more effective and efficient for CNN-DBLSTM-CTC based approaches to leverage an explicit language model to boost recognition accuracy.
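
To make the distinction in finding (1) concrete, the following is a minimal sketch of one decoder step attending over a two-dimensional feature map versus a one-dimensional one. It is an illustration only, not the paper's implementation: the dot-product scoring, the averaging over height used to build the one-dimensional sequence, the helper names (`attend`, `keys_2d`, `keys_1d`) and the toy values are all assumptions made here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Dot-product attention for one decoder step (assumed scoring function).

    Returns the attention weights and the weighted-sum context vector.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(keys[0])
    context = [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
    return weights, context


def keys_2d(feature_map):
    """2D attention: every (row, column) position is a separate key."""
    return [vec for row in feature_map for vec in row]


def keys_1d(feature_map):
    """1D attention: average over height so each column yields one key.

    Averaging is an assumption; other ways of collapsing height are possible.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    dim = len(feature_map[0][0])
    return [
        [sum(feature_map[r][c][d] for r in range(height)) / height for d in range(dim)]
        for c in range(width)
    ]


# Toy feature map: height 2, width 3, feature dimension 2 (made-up values).
feature_map = [
    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]],
]
query = [1.0, 0.0]  # hypothetical decoder state

w2d, c2d = attend(query, keys_2d(feature_map))
w1d, c1d = attend(query, keys_1d(feature_map))
print(len(w2d), len(w1d))  # number of attended positions: 6 vs 3
```

In the two-dimensional case the decoder can place weight on an individual spatial position, whereas in the one-dimensional case the vertical detail has already been merged before attention is applied.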