
GateCap: Gated spatial and semantic attention model for image captioning


Abstract

Visual attention has been widely used in deep image captioning models for its capacity to selectively align visual features with the corresponding words, i.e., word-to-region alignment. In many cases, however, existing attention modules fail to highlight task-related image regions because they lack high-level semantics, so effectively leveraging high-level semantics is non-trivial for advancing captioning models. To address these issues, we propose a gated spatial and semantic attention captioning model (GateCap), which adaptively fuses spatial attention features with semantic attention features. In particular, GateCap introduces two novel aspects: 1) spatial and semantic attention features are further enhanced via triple LSTMs in a divide-and-fuse learning manner, and 2) a context gate module reweighs spatial and semantic attention features in a fair manner. Benefiting from these, GateCap reduces the side effect that word-to-region misalignment at one time step has on subsequent word prediction, thereby alleviating the emergence of incorrect words during testing. Experiments on the MSCOCO dataset verify the efficacy of the proposed GateCap model in terms of both quantitative and qualitative results.

Bibliographic details

  • Source
    Multimedia Tools and Applications | 2020, Issue 18 | pp. 11531-11549 | 19 pages
  • Author affiliations

    Science and Technology on Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China; College of Computer, National University of Defense Technology, Changsha 410073, China;

    College of Computer, National University of Defense Technology, Changsha 410073, China; Institute for Quantum Information, State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China;

    College of Computer, National University of Defense Technology, Changsha 410073, China; Institute for Quantum Information, State Key Laboratory of High Performance Computing, National University of Defense Technology, Changsha 410073, China;

    Science and Technology on Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China; College of Computer, National University of Defense Technology, Changsha 410073, China;

  • Format: PDF
  • Language: English
  • Keywords

    Semantic attention; Spatial attention; Context gate;


