An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition

Baoguang Shi; Xiang Bai; Cong Yao

Abstract

Image-based sequence recognition has been a long-standing research topic in computer vision. In this paper, we investigate the problem of scene text recognition, which is among the most important and challenging tasks in image-based sequence recognition. A novel neural network architecture, which integrates feature extraction, sequence modeling and transcription into a unified framework, is proposed. Compared with previous systems for scene text recognition, the proposed architecture possesses four distinctive properties: (1) It is end-to-end trainable, in contrast to most existing algorithms, whose components are trained and tuned separately. (2) It naturally handles sequences of arbitrary length, involving no character segmentation or horizontal scale normalization. (3) It is not confined to any predefined lexicon and achieves remarkable performance in both lexicon-free and lexicon-based scene text recognition tasks. (4) It generates an effective yet much smaller model, which is more practical for real-world application scenarios. Experiments on standard benchmarks, including the IIIT-5K, Street View Text and ICDAR datasets, demonstrate the superiority of the proposed algorithm over prior art. Moreover, the proposed algorithm performs well on the task of image-based music score recognition, which supports its generality.
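Properties (2) and (3) can be illustrated with a small sketch of the transcription step. The abstract does not say how transcription is done, so everything below is an assumption made for illustration: the per-frame labels are taken as given, the blank symbol `-` is hypothetical, repeated labels are collapsed in a CTC-like manner, and the lexicon-based mode simply picks the lexicon word closest in edit distance to the lexicon-free output. It is not the authors' implementation.

```python
from itertools import groupby

BLANK = "-"  # hypothetical blank symbol emitted between characters


def collapse(frame_labels):
    """Merge consecutive repeated labels, then drop blanks.

    The input may have any number of frames, so no character
    segmentation or fixed-width normalization is needed.
    """
    return "".join(label for label, _ in groupby(frame_labels) if label != BLANK)


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def transcribe(frame_labels, lexicon=None):
    """Lexicon-free when no lexicon is given; otherwise snap to the nearest word."""
    text = collapse(frame_labels)
    if not lexicon:
        return text
    # Assumed simplification: nearest lexicon entry by edit distance.
    return min(lexicon, key=lambda word: edit_distance(text, word))
```

For example, `transcribe("hh-e-l-oo")` returns the raw string `helo`, while `transcribe("hh-e-l-oo", lexicon=["world", "shell", "hello"])` returns `hello`, the lexicon word closest to it.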

Bibliographic details

Source
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, Issue 11, pp. 2298-2304 (7 pages)
Authors
Baoguang Shi; Xiang Bai; Cong Yao
Affiliation (listed identically for all three authors)
School of Electronic Information and Communications, Huazhong University of Science and Technology, Wuhan, China
Original format
PDF
Language
English
Keywords
Feature extraction; Text recognition; Neural networks; Image recognition; Logic gates; Convolutional codes; Context
