Deep neural network with attention model for scene text recognition

Shuohao Li; Min Tang; Qiang Guo; Jun Lei; Jun Zhang

Abstract

The authors present a deep neural network (DNN) with an attention model for scene text recognition. The proposed model does not require any segmentation of the input text image. The framework is inspired by attention models recently proposed for speech recognition and image captioning. In the proposed framework, feature extraction, feature attention and sequence recognition are integrated into a single, jointly trainable network. Compared with previous approaches, the main contributions are as follows. (i) The attention model is applied to a DNN to recognise scene text, and it can effectively solve the sequence recognition problem caused by variable-length labels. (ii) Rigorous experiments are performed on a number of challenging benchmarks, including the IIIT5K, SVT, ICDAR2003 and ICDAR2013 datasets. The experimental results show that the proposed model is comparable to or better than state-of-the-art methods. (iii) The model contains only 6.5 million parameters. Compared with other DNN models for scene text recognition, it has the fewest parameters reported so far.
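The abstract gives no implementation detail, so the following is only a minimal sketch of how a generic attention decoder can emit labels of variable length from a fixed sequence of image features. It is not the authors' network. The additive scoring function, the plain tanh recurrent cell, the random weights, and every name (`attend`, `greedy_decode`, `init_params`, `eos_id`) are assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def attend(features, state, W_f, W_s, v):
    """Additive soft attention (an assumed scoring function).

    features: (T, D) feature vectors, e.g. one per image column
    state:    (H,) current decoder state
    Returns attention weights (T,) and the context vector (D,).
    """
    scores = np.tanh(features @ W_f + state @ W_s) @ v  # (T,)
    weights = softmax(scores)                           # non-negative, sums to 1
    context = weights @ features                        # weighted average, (D,)
    return weights, context


def greedy_decode(features, params, eos_id, max_len=25):
    """Emit one label per step until the end-of-sequence label appears.

    The output length is decided at run time, which is how an attention
    decoder copes with variable-length labels without segmentation.
    """
    state = np.zeros(params["W_s"].shape[0])
    labels = []
    for _ in range(max_len):
        _, context = attend(
            features, state, params["W_f"], params["W_s"], params["v"]
        )
        # Simple tanh recurrent update (a stand-in for a gated cell).
        state = np.tanh(context @ params["W_c"] + state @ params["W_h"])
        label = int(np.argmax(state @ params["W_o"]))
        if label == eos_id:
            break
        labels.append(label)
    return labels


def init_params(D, H, A, n_classes, seed=0):
    # Hypothetical random weights; a real model would learn these jointly.
    rng = np.random.default_rng(seed)
    shapes = {
        "W_f": (D, A), "W_s": (H, A), "v": (A,),
        "W_c": (D, H), "W_h": (H, H), "W_o": (H, n_classes),
    }
    return {name: rng.normal(scale=0.1, size=s) for name, s in shapes.items()}


if __name__ == "__main__":
    params = init_params(D=8, H=6, A=5, n_classes=4)
    feats = np.random.default_rng(1).normal(size=(10, 8))
    print(greedy_decode(feats, params, eos_id=3, max_len=7))
```

With untrained weights the decoded labels are meaningless; the sketch only shows the control flow, in which the decoder re-weights the same feature sequence at every step and stops when it predicts the end-of-sequence label.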

Bibliographic details

Source
Computer Vision, IET | 2017, Issue 7 | pp. 605-612 | 8 pages
Authors
Shuohao Li; Min Tang; Qiang Guo; Jun Lei; Jun Zhang
Author affiliations

National University of Defense Technology, People's Republic of China;

University of Alberta, Canada;

National University of Defense Technology, People's Republic of China;

National University of Defense Technology, People's Republic of China;

National University of Defense Technology, People's Republic of China;

Original format: PDF
Language: English
Keywords
feature extraction; image sequences; neural nets; text detection;

Similar literature

1. Text Detection and Recognition for Natural Scene Images Using Deep Convolutional Neural Networks [J]. Xianyu Wu, Chao Luo, Qian Zhang. Computers, Materials & Continua, 2019, Issue 1.
2. Text Detection and Recognition for Natural Scene Images Using Deep Convolutional Neural Networks [J]. Xianyu Wu, Chao Luo, Qian Zhang. Computers, Materials & Continua (English edition), 2019, Issue 7.
3. Convolutional recurrent neural networks with hidden Markov model bootstrap for scene text recognition [J]. Fenglei Wang, Qiang Guo, Jun Lei. Computer Vision, IET, 2017, Issue 6.
4. Scene text recognition with deeper convolutional neural networks [C]. Zhang Yuqi, Wang Wei, Wang Liang. IEEE International Conference on Image Processing, 2015.
5. A neural model of scene understanding: Multiple-scale spatial and feature-based attention in scene search, learning, and recognition [D]. Huang, Tsung-Ren. 2010.
6. Pre-trained Deep Convolution Neural Network Model With Attention for Speech Emotion Recognition [O]. Hua Zhang, Ruoyun Gou, Jili Shang. 2021.
7. Text-Attentional Convolutional Neural Networks for Scene Text Detection [O]. He, Tong, Huang, Weilin, Qiao, Yu. 2016.
