Source: Neural Processing Letters

Deep Captioning with Attention-Based Visual Concept Transfer Mechanism for Enriching Description



Abstract

In this paper, we propose a novel deep captioning framework called Attention-based multimodal recurrent neural network with Visual Concept Transfer Mechanism (A-VCTM). The proposed A-VCTM has three advantages. (1) A multimodal layer integrates the visual representation and the context representation, building a bridge that connects context information directly with visual information. (2) An attention mechanism leads the model to focus on the regions corresponding to the next word to be generated. (3) A visual concept transfer mechanism generates novel visual concepts and enriches the description sentences. Qualitative and quantitative results on two standard benchmarks, MSCOCO and Flickr30K, show the effectiveness and practicability of the proposed A-VCTM framework.
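
The first two components named in the abstract, attention over image regions and a multimodal layer that fuses visual and context representations, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the scoring function, the fusion formula, or any dimensions, so the dot-product scoring, the tanh fusion, and every name below (`attend`, `multimodal_layer`, `W_v`, `W_c`) are assumptions chosen for illustration only. The visual concept transfer mechanism is not sketched, since the abstract does not describe how it works.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in W]

def attend(regions, context):
    """Soft attention: weight each region by its similarity to the context.

    Assumption: dot-product scoring. The abstract does not specify
    the scoring function used by A-VCTM.
    """
    weights = softmax([dot(r, context) for r in regions])
    dim = len(regions[0])
    attended = [sum(w * r[i] for w, r in zip(weights, regions))
                for i in range(dim)]
    return attended, weights

def multimodal_layer(visual, context, W_v, W_c):
    """Fuse visual and context vectors as tanh(W_v v + W_c c).

    Assumption: additive fusion in a shared space; W_v and W_c are
    hypothetical projection matrices.
    """
    pv = matvec(W_v, visual)
    pc = matvec(W_c, context)
    return [math.tanh(a + b) for a, b in zip(pv, pc)]

# Toy example: three regions with 2-d features and a 2-d context vector.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
context = [2.0, 0.0]
attended, weights = attend(regions, context)

identity = [[1.0, 0.0], [0.0, 1.0]]
fused = multimodal_layer(attended, context, identity, identity)
```

In this toy setup the first region aligns best with the context vector, so it receives the largest attention weight. In a captioning model the fused vector would typically feed a layer that predicts the next word.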
