Conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition

IQA: Visual Question Answering in Interactive Environments

Abstract

We introduce Interactive Question Answering (IQA), the task of answering questions that require an autonomous agent to interact with a dynamic visual environment. IQA presents the agent with a scene and a question, such as 'Are there any apples in the fridge?' The agent must navigate around the scene, acquire visual understanding of scene elements, interact with objects (e.g. open refrigerators), and plan a series of actions conditioned on the question. Popular reinforcement learning approaches with a single controller perform poorly on IQA owing to the large and diverse state space. We propose the Hierarchical Interactive Memory Network (HIMN), consisting of a factorized set of controllers, allowing the system to operate at multiple levels of temporal abstraction. To evaluate HIMN, we introduce IQUAD V1, a new dataset built upon AI2-THOR [35], a simulated photo-realistic environment of configurable indoor scenes with interactive objects. IQUAD V1 has 75,000 questions, each paired with a unique scene configuration. Our experiments show that our proposed model outperforms popular single-controller methods on IQUAD V1. For sample questions and results, please view our video: https://youtu.be/pXd3C-1jr98.
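
To make the idea of a factorized set of controllers more concrete, the following is a minimal toy sketch, not the authors' HIMN implementation. It assumes a hypothetical one-dimensional `ToyKitchen` scene (unrelated to the real AI2-THOR API) and hand-written rules in place of the learned policies the abstract describes. All class and function names here are invented for illustration. The point it tries to show is only the control structure: a high-level planner chooses which low-level controller runs next, and each low-level controller may take several primitive actions before returning control.

```python
from dataclasses import dataclass


@dataclass
class ToyKitchen:
    """Hypothetical stand-in for a simulated scene; not the AI2-THOR API."""
    apple_in_fridge: bool
    agent_pos: int = 0        # position along a 1-D corridor
    fridge_pos: int = 3       # where the fridge sits in the corridor
    fridge_open: bool = False
    steps: int = 0            # primitive actions taken so far

    def move_forward(self):
        self.agent_pos += 1
        self.steps += 1

    def open_fridge(self):
        # Opening only succeeds when the agent stands at the fridge.
        if self.agent_pos == self.fridge_pos:
            self.fridge_open = True
        self.steps += 1

    def sees_apple(self):
        return self.fridge_open and self.apple_in_fridge


def navigator(env):
    """Low-level controller: issues many primitive moves per invocation."""
    while env.agent_pos < env.fridge_pos:
        env.move_forward()


def interactor(env):
    """Low-level controller: performs a single object interaction."""
    env.open_fridge()


def answerer(env):
    """Low-level controller: emits the final answer from what is visible."""
    return "yes" if env.sees_apple() else "no"


def planner(env):
    """High-level controller: picks which low-level controller runs next.

    Hand-written rules here; a learned policy would replace them.
    """
    if env.agent_pos < env.fridge_pos:
        return navigator
    if not env.fridge_open:
        return interactor
    return answerer


def answer_question(env, max_invocations=10):
    """Run the planner loop until the answerer is invoked."""
    for _ in range(max_invocations):
        controller = planner(env)
        result = controller(env)
        if controller is answerer:
            return result
    return None  # gave up without answering
```

In this sketch the planner is consulted three times for the fridge question, while the environment receives four primitive actions. That gap between planner decisions and primitive steps is a simple instance of operating at more than one level of temporal abstraction.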