Conference: IEEE International Conference on Software Engineering and Service Science

Deformable Mixed Domain Attention Network for Scene Text Recognition

Abstract

Scene text recognition has been an active research area in computer vision in recent years, yet it remains challenging because of the large variation in irregular text. Current methods treat recognition as a sequence-to-sequence task and solve it with an encoder-decoder framework. In this work, we propose the Deformable Mixed Domain Attention Network (DMDAN) for robust scene text recognition. First, we use deformable convolution to strengthen the model's ability to adapt to irregular text. Then, mixed-domain visual attention is employed in the encoder and self-attention in the decoder, which effectively alleviates the problem of “attention drifting”. Finally, we integrate center loss to reduce intra-class distances and make the classes easier to distinguish. Extensive experiments show that our model substantially outperforms the CRNN baseline and achieves performance comparable to existing attention-based methods on both regular and irregular datasets.
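
The abstract does not give the form of the center loss it integrates. As a rough illustration only, the sketch below uses the commonly cited formulation, in which each feature vector is pulled toward a running center for its class. It assumes a batch-averaged loss and a simple center update with rate `alpha`; the function names, the averaging, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def center_loss(features: Sequence[Sequence[float]],
                labels: Sequence[int],
                centers: Dict[int, List[float]]) -> float:
    """Batch mean of 0.5 * squared distance from each feature to its class center.

    Averaging over the batch is an assumption of this sketch.
    """
    total = 0.0
    for x, y in zip(features, labels):
        c = centers[y]
        total += 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return total / len(features)


def update_centers(features: Sequence[Sequence[float]],
                   labels: Sequence[int],
                   centers: Dict[int, List[float]],
                   alpha: float = 0.5) -> Dict[int, List[float]]:
    """Return new centers, each moved toward the features of its class in the batch."""
    new_centers = {k: list(v) for k, v in centers.items()}
    for k, c in centers.items():
        members = [x for x, y in zip(features, labels) if y == k]
        if not members:
            continue  # class absent from this batch: leave its center unchanged
        for d in range(len(c)):
            # Damped mean offset; the "1 +" keeps the step stable for tiny classes.
            delta = sum(c[d] - x[d] for x in members) / (1 + len(members))
            new_centers[k][d] = c[d] - alpha * delta
    return new_centers


# Toy batch (hypothetical data): two samples of class 0, one of class 1.
features = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
labels = [0, 0, 1]
centers = {0: [0.0, 0.0], 1: [3.0, 3.0]}

loss_before = center_loss(features, labels, centers)
centers_after = update_centers(features, labels, centers)
loss_after = center_loss(features, labels, centers_after)
```

In a recognition model this term would typically be added to the main classification or sequence loss with a small weight, so that features of the same character class cluster more tightly; how DMDAN weights or applies it is not stated in the abstract.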
