International Conference on Computer Vision (ICCV)

Attention on Attention for Image Captioning



Abstract

Attention mechanisms are widely used in current encoder/decoder frameworks for image captioning, where a weighted average of the encoded vectors is generated at each time step to guide the caption decoding process. However, the decoder has little idea of whether, or how well, the attended vector and the given attention query are related, which can lead the decoder to produce misleading results. In this paper, we propose an Attention on Attention (AoA) module, which extends conventional attention mechanisms to determine the relevance between attention results and queries. AoA first generates an information vector and an attention gate from the attention result and the current context, then applies a second attention by multiplying them element-wise, and finally obtains the attended information, i.e., the expected useful knowledge. We apply AoA to both the encoder and the decoder of our image captioning model, which we name AoA Network (AoANet). Experiments show that AoANet outperforms all previously published methods and achieves a new state-of-the-art performance of a 129.8 CIDEr-D score on the MS COCO Karpathy offline test split and a 129.6 CIDEr-D (C40) score on the official online testing server. Code is available at https://github.com/husthuaan/AoANet.
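The AoA step described in the abstract can be sketched as follows: concatenate the attention result with its query, derive an information vector and a sigmoid attention gate via two linear maps, and multiply them element-wise. This is a minimal NumPy illustration of that gating computation, not the authors' implementation; the weight matrices `W_i`, `W_g` and the dimension `d` are hypothetical stand-ins for learned parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hypothetical feature dimension

# Hypothetical learned weights for the two linear maps.
W_i = rng.standard_normal((2 * d, d)) / np.sqrt(2 * d)  # information vector map
W_g = rng.standard_normal((2 * d, d)) / np.sqrt(2 * d)  # attention gate map

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def aoa(query, attn_result):
    """Attention on Attention: gate the attention result by its query.

    Concatenates query and attention result, produces an information
    vector i and a gate g, and returns the attended information g * i.
    """
    x = np.concatenate([query, attn_result], axis=-1)  # [..., 2d]
    i = x @ W_i                                        # information vector
    g = sigmoid(x @ W_g)                               # attention gate in (0, 1)
    return g * i                                       # attended information

q = rng.standard_normal(d)   # attention query
v = rng.standard_normal(d)   # conventional attention result
out = aoa(q, v)              # gated, query-aware attention output, shape (d,)
```

Because the gate depends on both the query and the attention result, irrelevant attention outputs can be suppressed toward zero, which is the relevance-filtering behavior the abstract attributes to AoA.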
