International Conference on Big Data and Information Analytics

FANet: An End-to-End Full Attention Mechanism Model for Multi-Oriented Scene Text Recognition



Abstract

In this paper, we propose an end-to-end multi-oriented scene text recognition model with a full attention mechanism. Attention is adopted on both the encoder and decoder sides of the model. On the encoding side, the residual attention design not only combines easily with current state-of-the-art recognition architectures, but also allows the network depth to be increased without causing training collapse, so that image features are extracted more pertinently for encoding. On the decoding side, the idea of the seq2seq attention translation model is adopted to better translate image features into recognized words. Rather than following the usual detect-slice-recognize pipeline, the model is trained end-to-end to produce recognition results directly. Test results on two datasets show that, while greatly simplifying the training procedure, the model achieves results as good as or better than the detect-slice-recognize approach. We call this network FANet.
