IQA: Visual Question Answering in Interactive Environments

Abstract

We introduce Interactive Question Answering (IQA), the task of answering questions that require an autonomous agent to interact with a dynamic visual environment. IQA presents the agent with a scene and a question, like: "Are there any apples in the fridge?" The agent must navigate around the scene, acquire visual understanding of scene elements, interact with objects (e.g., open refrigerators) and plan for a series of actions conditioned on the question. Popular reinforcement learning approaches with a single controller perform poorly on IQA owing to the large and diverse state space. We propose the Hierarchical Interactive Memory Network (HIMN), consisting of a factorized set of controllers, allowing the system to operate at multiple levels of temporal abstraction. To evaluate HIMN, we introduce IQUAD V1, a new dataset built upon AI2-THOR [35], a simulated photo-realistic environment of configurable indoor scenes with interactive objects. IQUAD V1 has 75,000 questions, each paired with a unique scene configuration. Our experiments show that our proposed model outperforms popular single-controller-based methods on IQUAD V1. For sample questions and results, please view our video: https://youtu.be/pXd3C-1jr98.
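
The notion of controllers operating at multiple levels of temporal abstraction can be illustrated with a toy sketch. It is not the authors' HIMN implementation, involves no learning, and does not use AI2-THOR. The one-dimensional `ToyKitchen` world and the names `navigator`, `manipulator`, `answerer`, and `high_level_planner` are all hypothetical and chosen only for illustration. The sketch assumes a hand-written high-level planner that picks one sub-controller at a time, where each sub-controller may issue several primitive actions before returning control.

```python
from dataclasses import dataclass


@dataclass
class ToyKitchen:
    """Hypothetical 1-D world: the agent walks along a line to a fridge."""
    agent_pos: int = 0
    fridge_pos: int = 3
    fridge_open: bool = False
    fridge_contents: tuple = ("apple",)
    steps: int = 0  # number of primitive actions taken so far

    def step(self, action: str) -> None:
        """Apply one primitive action and count it."""
        self.steps += 1
        if action == "move_forward" and self.agent_pos < self.fridge_pos:
            self.agent_pos += 1
        elif action == "open" and self.agent_pos == self.fridge_pos:
            self.fridge_open = True

    def visible_objects(self) -> tuple:
        """Contents are observable only from an open fridge."""
        if self.fridge_open and self.agent_pos == self.fridge_pos:
            return self.fridge_contents
        return ()


def navigator(env: ToyKitchen) -> None:
    """Low-level controller: issues primitive moves until the fridge is reached."""
    while env.agent_pos < env.fridge_pos:
        env.step("move_forward")


def manipulator(env: ToyKitchen) -> None:
    """Low-level controller: interacts with the object in front of the agent."""
    env.step("open")


def answerer(env: ToyKitchen, target: str) -> bool:
    """Low-level controller: answers an existence question from what is visible."""
    return target in env.visible_objects()


def high_level_planner(env: ToyKitchen, target: str):
    """Pick one sub-controller per decision; each may span several primitive steps."""
    trace = []
    while True:
        if env.agent_pos != env.fridge_pos:
            trace.append("navigate")
            navigator(env)
        elif not env.fridge_open:
            trace.append("interact")
            manipulator(env)
        else:
            trace.append("answer")
            return answerer(env, target), trace


if __name__ == "__main__":
    env = ToyKitchen()
    answer, trace = high_level_planner(env, "apple")
    print(answer, trace, env.steps)
```

In this toy run the planner makes three high-level decisions (navigate, interact, answer) while the environment records four primitive actions. That gap is the sense in which the upper level works on a coarser time scale than the controllers it invokes.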

Bibliographic details

Source
IEEE/CVF Conference on Computer Vision and Pattern Recognition | 2018 | 731p | 10 pages
Authors
Daniel Gordon; Aniruddha Kembhavi; Mohammad Rastegari; Joseph Redmon; Dieter Fox; Ali Farhadi
Original format: PDF
Chinese Library Classification: TP391.41-53
Keywords
Task analysis; Navigation; Visualization; Knowledge discovery; Semantics; Planning
Date added to database: 2022-08-20 20:14:54

Similar literature

1. IQA: Interactive query construction in semantic question answering systems [J]. Zafar Hamid, Dubey Mohnish, Lehmann Jens. Journal of Web Semantics. 2020, Issue Octa
2. Survey on Answer Validation for Indonesian Question Answering System (IQAS) [J]. Abdiansah Abdiansah, Azhari Azhari, Anny K. Sari. International Journal of Intelligent Systems and Applications. 2018, Issue 4
3. Connecting Question Answering and Conversational Agents - Contextualizing German Questions for Interactive Question Answering Systems [J]. Ulli Waltinger, Alexa Breuing, Ipke Wachsmuth. Kunstliche Intelligenz. 2012, Issue 4
4. IQA: Visual Question Answering in Interactive Environments [C]. Daniel Gordon, Aniruddha Kembhavi, Mohammad Rastegari. IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2018
5. Attention Correction Mechanisms in Visual Contexts in Visual Question Answering [D]. Sharan, Komal. 2018
6. An Effective Dense Co-Attention Networks for Visual Question Answering [O]. Shirong He, Dezhi Han. 2020
7. IQA: Interactive query construction in semantic question answering systems [O]. Hamid Zafar, Mohnish Dubey, Jens Lehmann. 2020
8. HITIQA: A Data Driven Approach to Interactive Analytical Question Answering [R]. Small, S., Strzalkowski, T. 2004
