Deformable Mixed Domain Attention Network for Scene Text Recognition


Abstract

Scene text recognition has been an active research area in computer vision in recent years, yet it remains challenging because of the large variation in irregular text. Current methods treat recognition as a sequence-to-sequence task and solve it with an encoder-decoder framework. In this work, we propose the Deformable Mixed Domain Attention Network (DMDAN) for robust scene text recognition. First, we use deformable convolution to strengthen the model's ability to adapt to irregular text. Then, mixed-domain visual attention and self-attention are employed in the encoder and decoder, respectively, which effectively alleviates the “attention drifting” problem. Finally, we integrate the center loss to reduce intra-class distances and make each class easier to distinguish. Extensive experiments show that our model substantially outperforms the CRNN baseline and achieves performance comparable to existing attention-based methods on both regular and irregular datasets.
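
The abstract does not give the exact form of the center loss used in DMDAN. As a rough illustration only, the sketch below assumes the commonly used formulation, half the summed squared distance between each feature vector and the center of its class. The function names `center_loss` and `class_means` are hypothetical, and using per-class means as centers is a simplifying assumption; in training, centers are usually updated alongside the network.

```python
from typing import Dict, List, Sequence


def center_loss(features: Sequence[Sequence[float]],
                labels: Sequence[int],
                centers: Dict[int, List[float]]) -> float:
    """Half the summed squared distance of each feature to its class center.

    Assumed formulation, not taken from the paper:
        L_c = 1/2 * sum_i ||x_i - c_{y_i}||^2
    """
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return 0.5 * total


def class_means(features: Sequence[Sequence[float]],
                labels: Sequence[int]) -> Dict[int, List[float]]:
    """Hypothetical helper: take the per-class feature mean as each center."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, xi in enumerate(x):
            acc[i] += xi
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


# Toy data: two features of class 0, one feature of class 1.
features = [[0.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
labels = [0, 0, 1]
centers = class_means(features, labels)
loss = center_loss(features, labels, centers)
```

Minimizing this term pulls features of the same class toward a shared center, which is the "reduce intra-class distances" effect the abstract describes. It is normally added to the main recognition loss with a weighting factor rather than used on its own.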


Bibliographic details

Source
IEEE International Conference on Software Engineering and Service Science, 2020, pp. 142-145 (4 pages)
Authors
Yangyang Huang; Wei Fang
Original format: PDF
Keywords
scene text recognition; deformable convolution; attention mechanism; center loss


