Context-Aware Group Captioning via Self-Attention and Contrastive Features

Abstract

While image captioning has progressed rapidly, existing works focus mainly on describing single images. In this paper, we introduce a new task, context-aware group captioning, which aims to describe a group of target images in the context of another group of related reference images. Context-aware group captioning requires not only summarizing information from both the target and reference image groups but also contrasting between them. To solve this problem, we propose a framework combining a self-attention mechanism with contrastive feature construction to effectively summarize common information from each image group while capturing discriminative information between them. To build the dataset for this task, we propose to group the images and generate the group captions from single-image captions using scene graph matching. Our datasets are constructed on top of the public Conceptual Captions dataset and our new Stock Captions dataset. Experiments on the two datasets show the effectiveness of our method on this new task.
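
The two ingredients named in the abstract, self-attention within an image group and a contrastive feature between groups, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each image is already encoded as a plain feature vector, that attention uses no learned projections, that a group is summarized by mean pooling, and that the contrastive part is the target summary minus the summary of the joint (target plus reference) group. All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def self_attend(features):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys and values are the raw features
    (no learned projection matrices).
    """
    dim = len(features[0])
    attended = []
    for query in features:
        weights = softmax([dot(query, key) / math.sqrt(dim) for key in features])
        attended.append(
            [sum(w * value[i] for w, value in zip(weights, features)) for i in range(dim)]
        )
    return attended


def group_representation(features):
    """Summarize a group: self-attention, then mean pooling over images."""
    attended = self_attend(features)
    return [sum(column) / len(attended) for column in zip(*attended)]


def contrastive_features(target, reference):
    """Concatenate the target summary with (target summary - joint summary).

    Assumption: subtraction and concatenation are illustrative choices.
    """
    target_repr = group_representation(target)
    joint_repr = group_representation(target + reference)
    contrast = [t - j for t, j in zip(target_repr, joint_repr)]
    return target_repr + contrast


if __name__ == "__main__":
    target_group = [[1.0, 0.0], [0.0, 1.0]]
    reference_group = [[1.0, 1.0], [0.5, 0.5], [0.0, 2.0]]
    print(contrastive_features(target_group, reference_group))
```

In this sketch the first half of the returned vector summarizes what the target images share, and the second half is close to zero whenever the reference group adds nothing that distinguishes it from the target group. A caption decoder would consume the concatenated vector; that part is omitted here.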

Bibliographic details

Source
IEEE/CVF Conference on Computer Vision and Pattern Recognition | 2020 | pp. 3437-3447 | 11 pages
Authors
Zhuowan Li; Quan Tran; Long Mai; Zhe Lin; Alan L. Yuille
Original format
PDF
Keywords
Task analysis; Visualization; Computer vision; Context modeling; Training; Natural languages; Computational modeling

