International Conference on Big Data and Information Analytics

FANet: An End-to-End Full Attention Mechanism Model for Multi-Oriented Scene Text Recognition



Abstract

In this paper, we propose an end-to-end multi-oriented scene text recognition model with a full attention mechanism. Attention is adopted on both the encoder and decoder sides of the model. On the encoding side, the residual attention design not only combines easily with current state-of-the-art recognition architectures, but also allows the network depth to be increased without causing training collapse, so that image features are extracted more pertinently for encoding. On the decoding side, the idea of the seq2seq attention translation model is adopted to better translate image features into recognized words. Rather than following the usual detect-slice-recognize pipeline, the model is trained end-to-end to produce recognition results directly. Test results on two datasets show that, while greatly simplifying the training procedure, the model achieves results as good as or better than the detect-slice-recognize approach. We call this network FANet.
