Recognizing Multiple Text Sequences from an Image by Pure End-to-End Learning

Abstract

We address a challenging problem: recognizing multiple text sequences from an image by pure end-to-end learning. The problem is twofold. 1) Multiple text sequence recognition: each image may contain multiple text sequences of different content, location, and orientation, and we try to recognize all of them. 2) Pure end-to-end (PEE) learning: we solve the problem in a pure end-to-end way, where each training image is labeled only with the text transcripts of the contained sequences, without any geometric annotations. Most existing works recognize multiple text sequences from an image in a non-end-to-end (NEE) or quasi-end-to-end (QEE) way, in which each training image is annotated with both text transcripts and text locations. Only recently was a PEE method proposed to recognize a text sequence that is split into several lines in the image. However, it cannot be directly applied to recognizing multiple text sequences from an image. In this paper, we therefore propose a pure end-to-end learning method to recognize multiple text sequences from an image. Our method directly learns the probability distribution of multiple sequences conditioned on each input image, and outputs multiple text transcripts with a well-designed decoding strategy. To evaluate the proposed method, we construct several datasets mainly based on an existing public dataset and two real application scenarios. Experimental results show that the proposed method can effectively recognize multiple text sequences from images and outperforms CTC-based and attention-based baseline methods.
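
For context on the CTC-based baselines named in the abstract, the following is a minimal sketch of standard greedy CTC decoding for a single sequence. It is not the paper's multi-sequence method or its decoding strategy, which the abstract does not detail. The function name `ctc_greedy_decode`, the blank index 0, and the toy two-letter alphabet are illustrative assumptions.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet):
    """Pick the best label per frame, merge repeats, then drop blanks."""
    # Arg-max over the label axis for every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Collapse runs of consecutive identical labels into one label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Drop blanks; label i (i >= 1) maps to alphabet[i - 1].
    return "".join(alphabet[label - 1] for label in collapsed if label != BLANK)


# Toy per-frame scores over {blank, 'a', 'b'} (hypothetical values).
toy_scores = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a (repeat, merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank (separates the two a's)
    [0.20, 0.70, 0.10],  # a
    [0.10, 0.20, 0.70],  # b
    [0.80, 0.10, 0.10],  # blank
]
decoded = ctc_greedy_decode(toy_scores, alphabet="ab")
print(decoded)  # aab
```

A decoder of this kind yields one transcript per input, which is consistent with the abstract's point that recognizing several sequences from one image without location labels calls for a different formulation.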

Bibliographic details

Source
International Conference on Pattern Recognition | 2021 | pp. 7058-7065 | 8 pages
Authors
Zhenlong Xu; Shuigeng Zhou; Fan Bai; Zhanzhan Cheng; Yi Niu; Shiliang Pu
Original format
PDF
Keywords
Training; Performance evaluation; Learning systems; Image recognition; Text recognition; Annotations; Probability distribution

