International Conference on Multimedia Big Data

Reference Based on Adaptive Attention Mechanism for Image Captioning*



Abstract

Image captioning, at the intersection of computer vision and natural language processing, has received increasing attention in recent years. Most existing methods use attention-based CNN-RNN frameworks, which understand images effectively and generate more natural descriptions. Recent research has found that a textual expression of an event can effectively promote human understanding and lead to successful detection of objects and actions. Inspired by this textual information, in this paper we propose a Reference based on Adaptive Attention Mechanism (R-AAM) model, which adds a reference sentence to the attention mechanism to correct the image regions on which attention is concentrated. In both training and testing, the reference sentence is selected by computing the largest consensus score among the nearest images in the training set. The reference sentence associated with an image helps select the salient region for each generated word in the time sequence. Results show that sentences generated with the reference sentence express richer semantic information and correct misrecognition errors. Our proposed R-AAM method achieves comparable performance on the well-known public MSCOCO dataset under five popular evaluation metrics.
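The abstract does not give implementation details for the reference-sentence selection step. The following is a minimal sketch, under assumed conventions, of how a reference sentence could be chosen for a query image by retrieving its nearest training images and taking the candidate caption with the largest consensus score among the neighbors' captions. The feature representation, the neighborhood size k, and the sentence-similarity function (e.g. a CIDEr-style metric) are assumptions; all function names are illustrative, not the authors' code.

    # Sketch: consensus-based reference sentence selection (illustrative only).
    import numpy as np

    def nearest_images(query_feat, train_feats, k=60):
        """Indices of the k training images most similar to the query feature."""
        # Cosine similarity between the query and every training image feature.
        sims = train_feats @ query_feat / (
            np.linalg.norm(train_feats, axis=1) * np.linalg.norm(query_feat) + 1e-8
        )
        return np.argsort(-sims)[:k]

    def consensus_score(candidate, references, sentence_sim):
        """Average similarity of one candidate caption to all neighbor captions."""
        return np.mean([sentence_sim(candidate, r) for r in references])

    def select_reference(query_feat, train_feats, train_captions, sentence_sim, k=60):
        """Pick the neighbor caption with the largest consensus score."""
        idx = nearest_images(query_feat, train_feats, k)
        pool = [c for i in idx for c in train_captions[i]]  # candidate captions
        scores = [consensus_score(c, pool, sentence_sim) for c in pool]
        return pool[int(np.argmax(scores))]

The selected sentence would then be fed, alongside the image features, into the adaptive attention module to guide which region is attended to at each decoding step, as described in the abstract.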
