Image and Vision Computing

From known to the unknown: Transferring knowledge to answer questions about novel visual and semantic concepts

Abstract

Current Visual Question Answering (VQA) systems can answer intelligent questions about 'known' visual content. However, their performance drops significantly when questions about visually and linguistically 'unknown' concepts are presented during inference ('Open-world' scenario). A practical VQA system should be able to deal with novel concepts in real world settings. To address this problem, we propose an exemplar-based approach that transfers learning (i.e., knowledge) from previously 'known' concepts to answer questions about the 'unknown'. We learn a highly discriminative joint embedding (JE) space, where visual and semantic features are fused to give a unified representation. Once novel concepts are presented to the model, it looks for the closest match from an exemplar set in the JE space. This auxiliary information is used alongside the given Image-Question pair to refine visual attention in a hierarchical fashion. Our novel attention model is based on a dual-attention mechanism that combines the complementary effect of spatial and channel attention. Since handling the high dimensional exemplars on large datasets can be a significant challenge, we introduce an efficient matching scheme that uses a compact feature description for search and retrieval. To evaluate our model, we propose a new dataset for VQA, separating unknown visual and semantic concepts from the training set. Our approach shows significant improvements over state-of-the-art VQA models on the proposed Open-World VQA dataset and other standard VQA datasets. (c) 2020 Elsevier B.V. All rights reserved.
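The exemplar lookup described above can be sketched in a few lines. This is only an illustrative toy, not the paper's formulation: the fusion by L2-normalized concatenation and the cosine-similarity search are assumptions standing in for the learned joint embedding and the compact-descriptor matching scheme.

```python
import numpy as np

def fuse(visual, semantic):
    # Hypothetical fusion: concatenate visual and semantic features and
    # L2-normalize to form a joint embedding (JE) vector. The paper learns
    # this embedding; here it is a fixed stand-in for illustration.
    joint = np.concatenate([visual, semantic])
    return joint / np.linalg.norm(joint)

def nearest_exemplar(query, exemplars):
    # Cosine-similarity search over the exemplar set; vectors are
    # unit-normalized, so a dot product suffices.
    sims = exemplars @ query
    best = int(np.argmax(sims))
    return best, float(sims[best])

# Toy example: 3 exemplars built from 4-d visual + 4-d semantic features.
rng = np.random.default_rng(0)
exemplar_set = np.stack(
    [fuse(rng.normal(size=4), rng.normal(size=4)) for _ in range(3)]
)
query = fuse(rng.normal(size=4), rng.normal(size=4))
idx, sim = nearest_exemplar(query, exemplar_set)
```

The retrieved exemplar (`idx`) would then be passed, alongside the Image-Question pair, to the hierarchical attention stage; in practice the paper replaces the brute-force dot product with a compact feature description to keep retrieval tractable on large exemplar sets.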
