Pattern Recognition: The Journal of the Pattern Recognition Society

Learning visual relationship and context-aware attention for image captioning



Abstract

Image captioning, which automatically generates natural language descriptions for images, has attracted considerable research attention, and attention-based captioning methods have made substantial progress. However, most attention-based image captioning methods focus on extracting visual information from regions of interest for sentence generation and usually ignore relational reasoning among those regions. Moreover, these methods do not take into account previously attended regions, which can be used to guide subsequent attention selection. In this paper, we propose a novel method that implicitly models the relationships among regions of interest in an image with a graph neural network, together with a novel context-aware attention mechanism that guides attention selection by fully memorizing previously attended visual content. Compared with existing attention-based image captioning methods, ours can not only learn relation-aware visual representations for image captioning but also exploit historical context information from previous attention steps. We perform extensive experiments on two public benchmark datasets, MS COCO and Flickr30K, and the experimental results indicate that our proposed method outperforms various state-of-the-art methods in terms of the widely used evaluation metrics. (C) 2019 Elsevier Ltd. All rights reserved.
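The abstract describes two components: implicit relation modeling over region features with a graph neural network, and an attention mechanism that accumulates previously attended content to steer later attention steps. The following is a minimal, hypothetical NumPy sketch of those two ideas; the function names, the simple GCN-style residual update, and the additive context memory are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def relational_gnn(regions, adj, W, steps=1):
    """Implicit relation modeling: each region feature aggregates
    messages from related regions via a (normalized) adjacency matrix.

    regions: (N, d) region-of-interest features
    adj:     (N, N) soft relation/adjacency weights
    W:       (d, d) learnable message transform (assumed)
    """
    h = regions
    for _ in range(steps):
        messages = adj @ h @ W      # gather transformed neighbor features
        h = np.tanh(h + messages)   # residual update keeps original content
    return h

def context_aware_attention(h, query, ctx):
    """Attention that remembers what was attended before.

    h:     (N, d) relation-aware region features
    query: (d,)   decoder state at the current time step
    ctx:   (d,)   running memory of previously attended content
    """
    scores = h @ (query + ctx)              # past context biases the scores
    weights = np.exp(scores - scores.max()) # numerically stable softmax
    weights /= weights.sum()
    attended = weights @ h                  # (d,) attended visual vector
    new_ctx = ctx + attended                # memorize attended content
    return attended, weights, new_ctx
```

In a full captioning decoder, `new_ctx` would be fed back at the next word-generation step so that already-described regions are down- or up-weighted by the learned scoring, rather than re-attended blindly.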
