IEEE International Conference on Computer Vision

VQS: Linking Segmentations to Questions and Answers for Supervised Attention in VQA and Question-Focused Semantic Segmentation



Abstract

Rich and dense human labeled datasets are among the main enabling factors for the recent advance on vision-language understanding. Many seemingly distant annotations (e.g., semantic segmentation and visual question answering (VQA)) are inherently connected in that they reveal different levels and perspectives of human understandings about the same visual scenes - and even the same set of images (e.g., of COCO). The popularity of COCO correlates those annotations and tasks. Explicitly linking them up may significantly benefit both individual tasks and the unified vision and language modeling. We present the preliminary work of linking the instance segmentations provided by COCO to the questions and answers (QAs) in the VQA dataset, and name the collected links visual questions and segmentation answers (VQS). They transfer human supervision between the previously separate tasks, offer more effective leverage to existing problems, and also open the door for new research problems and models. We study two applications of the VQS data in this paper: supervised attention for VQA and a novel question-focused semantic segmentation task. For the former, we obtain state-of-the-art results on the VQA real multiple-choice task by simply augmenting the multilayer perceptrons with some attention features that are learned using the segmentation-QA links as explicit supervision. To put the latter in perspective, we study two plausible methods and compare them to an oracle method assuming that the instance segmentations are given at the test stage.
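The abstract's first application trains VQA attention with explicit supervision: the segmentation mask linked to a question–answer pair tells the model *where* to look, instead of letting attention emerge only from the answer loss. The sketch below illustrates one plausible form of that supervision, assuming image features are pooled over a coarse grid of regions; the function names, the grid reduction, and the cross-entropy objective are illustrative assumptions, not the paper's exact implementation.

```python
import math

def mask_to_attention_target(mask, grid=(2, 2)):
    """Reduce a binary segmentation mask (H x W, nested lists of 0/1) to a
    target attention distribution over grid regions: each region's weight is
    its share of the mask's foreground pixels. Hypothetical reduction."""
    H, W = len(mask), len(mask[0])
    gh, gw = grid
    counts = [0.0] * (gh * gw)
    for r in range(H):
        for c in range(W):
            if mask[r][c]:
                counts[(r * gh // H) * gw + (c * gw // W)] += 1.0
    total = sum(counts)
    if total == 0:  # no linked segmentation: fall back to uniform attention
        return [1.0 / (gh * gw)] * (gh * gw)
    return [v / total for v in counts]

def softmax(scores):
    """Numerically stable softmax over a list of region scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def attention_loss(scores, target, eps=1e-8):
    """Cross-entropy between the predicted attention distribution (softmax of
    the model's region scores) and the mask-derived target distribution.
    This is the 'explicit supervision' term; in training it would be added to
    the usual answer-classification loss."""
    pred = softmax(scores)
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))
```

For example, if the linked mask covers only the top-left quadrant, the target puts all weight on that region, and scores favoring any other region incur a higher loss, pushing the attention map toward the segmented object.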
