IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks



Abstract

This paper addresses a new problem - jointly inferring human attention, intentions, and tasks from videos. Given an RGB-D video where a human performs a task, we answer three questions simultaneously: 1) where the human is looking - attention prediction; 2) why the human is looking there - intention prediction; and 3) what task the human is performing - task recognition. We propose a hierarchical model of human-attention-object (HAO) which represents tasks, intentions, and attention under a unified framework. A task is represented as sequential intentions which transition to each other. An intention is composed of the human pose, attention, and objects. A beam search algorithm is adopted for inference on the HAO graph to output the attention, intention, and task results. We built a new video dataset of tasks, intentions, and attention. It contains 14 task classes, 70 intention categories, 28 object classes, 809 videos, and approximately 330,000 frames. Experiments show that our approach outperforms existing approaches.
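The abstract states that inference on the HAO graph uses beam search to output attention, intention, and task labels. As a rough illustration of that idea only, here is a generic beam search over per-frame intention labels; the `score` function, the toy emission scores, and the smoothness bonus are all hypothetical stand-ins for the paper's actual HAO potentials, not the authors' implementation.

```python
def beam_search(frames, labels, score, beam_width=3):
    """Generic beam search over label sequences.

    frames: list of per-frame observations
    labels: candidate intention labels
    score(seq, frame, label): incremental score of extending a partial
        sequence `seq` with `label` at `frame` (a stand-in for the
        paper's model potentials).
    Returns (best_sequence, best_total_score).
    """
    beams = [([], 0.0)]  # list of (partial sequence, accumulated score)
    for frame in frames:
        candidates = []
        for seq, total in beams:
            for label in labels:
                candidates.append((seq + [label], total + score(seq, frame, label)))
        # Keep only the top-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]

# Toy example: per-frame emission scores for two invented intention
# labels, plus a small bonus for temporal smoothness (repeating the
# previous label).
emissions = [
    {"reach": 0.9, "pour": 0.1},
    {"reach": 0.4, "pour": 0.6},
    {"reach": 0.2, "pour": 0.8},
]

def score(seq, frame, label):
    smooth = 0.2 if seq and seq[-1] == label else 0.0
    return frame[label] + smooth

best_seq, best_score = beam_search(emissions, ["reach", "pour"], score)
# → (["reach", "pour", "pour"], 2.5)
```

With `beam_width=3` the search keeps the three best partial sequences per frame, trading exactness for tractability on long videos.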

