
A holistic representation guided attention network for scene text recognition


Abstract

Reading irregular scene text of arbitrary shape in natural images remains a challenging problem despite recent progress. Many existing approaches incorporate sophisticated network structures to handle various shapes, use extra annotations for stronger supervision, or employ hard-to-train recurrent neural networks for sequence modeling. In this work, we propose a simple yet strong approach for scene text recognition. Without converting input images to sequence representations, we directly connect two-dimensional CNN features to an attention-based sequence decoder that is guided by a holistic representation. The holistic representation guides the attention-based decoder to focus on more accurate areas. As no recurrent module is adopted, our model can be trained in parallel. Compared with its RNN counterparts, it achieves a 1.5x to 9.4x speedup in the backward pass and a 1.3x to 7.9x speedup in the forward pass. The proposed model is trained with only word-level annotations. With this simple design, our method achieves state-of-the-art or competitive recognition performance on the evaluated regular and irregular scene text benchmark datasets. (C) 2020 Elsevier B.V. All rights reserved.
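
The core idea in the abstract, attention computed directly over a two-dimensional feature map with a global "holistic" vector steering the query, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the holistic representation is computed or how it enters the decoder, so the average pooling in `holistic_vector`, the additive combination in `guided_query`, and all function names here are assumptions made purely for illustration.

```python
import math


def holistic_vector(feature_map):
    """Average every spatial cell into one global vector (assumed pooling choice)."""
    cells = [cell for row in feature_map for cell in row]
    dim = len(cells[0])
    return [sum(c[d] for c in cells) / len(cells) for d in range(dim)]


def guided_query(holistic, prev_embedding):
    """Combine the holistic vector with the previous character embedding.

    Element-wise addition is an assumption; the abstract only says the
    decoder is guided by the holistic representation.
    """
    return [h + e for h, e in zip(holistic, prev_embedding)]


def attend_2d(feature_map, query):
    """Scaled dot-product attention of one query over all cells of a 2D map."""
    cells = [cell for row in feature_map for cell in row]
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, cell)) / math.sqrt(dim) for cell in cells
    ]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: attention-weighted sum of the cell features.
    context = [sum(w * cell[d] for w, cell in zip(weights, cells)) for d in range(dim)]
    # Reshape the flat weights back to the map's height x width layout.
    width = len(feature_map[0])
    weight_map = [
        weights[r * width:(r + 1) * width] for r in range(len(feature_map))
    ]
    return context, weight_map


# Toy 2 x 3 feature map with 2 channels per cell (made-up values).
features = [
    [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
    [[2.0, 0.0], [0.0, 0.0], [0.0, 2.0]],
]
holistic = holistic_vector(features)
query = guided_query(holistic, [1.0, 0.0])  # hypothetical previous-character embedding
context, weight_map = attend_2d(features, query)
```

Because each decoding step in this sketch depends only on the feature map, the holistic vector, and the previous character's embedding, and not on a recurrent hidden state, all steps of a training sequence could be computed at once when the ground-truth characters are known. That is the property the abstract credits for parallel training.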

Bibliographic details

  • Source
    Neurocomputing | 2020, Issue 13 | pp. 67-75 | 9 pages
  • Author affiliations

    Northwestern Polytech Univ Sch Comp Sci Xian Peoples R China|Natl Engn Lab Integrated Aerosp Ground Ocean Big Xian Peoples R China;

    Northwestern Polytech Univ Sch Comp Sci Xian Peoples R China|Natl Engn Lab Integrated Aerosp Ground Ocean Big Xian Peoples R China;

    Univ Adelaide Sch Comp Sci Adelaide SA Australia;

    MinSheng FinTech Corp Ltd Beijing Peoples R China;

    Northwestern Polytech Univ Sch Comp Sci Xian Peoples R China|Natl Engn Lab Integrated Aerosp Ground Ocean Big Xian Peoples R China;

  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA);
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification
  • Keywords

    Holistic Representation; Convolutional-Attention; Transformer; Scene Text Recognition;

