Conference: Annual Conference on Neural Information Processing Systems

Action from Still Image Dataset and Inverse Optimal Control to Learn Task Specific Visual Scanpaths

Abstract

Human eye movements provide a rich source of information about human visual information processing. The complex interplay between the task and the visual stimulus is believed to determine human eye movements, yet it is not fully understood, which makes it difficult to develop reliable eye movement prediction systems. Our work makes three contributions towards addressing this problem. First, we complement one of the largest and most challenging static computer vision datasets, VOC 2012 Actions, with human eye movement recordings collected under the primary task constraint of action recognition and, separately, of context recognition, in order to analyze the impact of different tasks. Our dataset is unique among eye-tracking datasets of still images in its large scale (over 1 million fixations recorded in 9157 images) and its different task controls. Second, we propose Markov models to automatically discover areas of interest (AOIs) and introduce novel sequential consistency metrics based on them. Our methods can automatically determine the number, the spatial support, and the transitions between AOIs, in addition to their locations. Based on such encodings, we quantitatively show that, given unconstrained real-world stimuli, task instructions have a significant influence on human visual search patterns, and that these patterns are stable across subjects. Finally, we leverage powerful machine learning techniques and computer vision features to learn task-sensitive reward functions from eye movement data, within models based on inverse optimal control that effectively predict human visual search patterns. The methodology achieves state-of-the-art scanpath modeling results.
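As a rough illustration of what "transitions between AOIs" could mean in practice, the sketch below estimates a first-order Markov transition matrix from scanpaths whose fixations are already labelled with AOI indices. This is not the authors' model: the abstract does not specify how AOIs are discovered or how the Markov models are parameterized, so the function name `transition_matrix`, the integer AOI labels, and the uniform fallback for unvisited rows are all assumptions made for illustration only.

```python
def transition_matrix(scanpaths, num_aois):
    """Estimate first-order AOI transition probabilities.

    scanpaths: list of sequences of AOI indices (hypothetical labelling,
               one sequence per viewing of an image).
    num_aois:  number of distinct AOIs (assumed known here).
    """
    # Count how often a fixation in AOI `src` is followed by one in AOI `dst`.
    counts = [[0] * num_aois for _ in range(num_aois)]
    for path in scanpaths:
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1

    # Normalize each row into a probability distribution.
    matrix = []
    for row in counts:
        total = sum(row)
        # Assumption: rows with no outgoing transitions fall back to uniform.
        matrix.append([c / total if total else 1.0 / num_aois for c in row])
    return matrix


# Toy scanpaths over three AOIs (made-up data, not from the dataset).
scanpaths = [[0, 1, 2], [0, 1, 1, 2], [0, 2]]
P = transition_matrix(scanpaths, num_aois=3)
```

Comparing such matrices across subjects or across task instructions is one simple way to probe sequential consistency, though the metrics proposed in the paper may differ.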
