IEEE/CVF Conference on Computer Vision and Pattern Recognition

Where and Why are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks



Abstract

This paper addresses a new problem - jointly inferring human attention, intentions, and tasks from videos. Given an RGB-D video where a human performs a task, we answer three questions simultaneously: 1) where the human is looking - attention prediction; 2) why the human is looking there - intention prediction; and 3) what task the human is performing - task recognition. We propose a hierarchical model of human-attention-object (HAO) which represents tasks, intentions, and attention under a unified framework. A task is represented as sequential intentions which transition to each other. An intention is composed of the human pose, attention, and objects. A beam search algorithm is adopted for inference on the HAO graph to output the attention, intention, and task results. We built a new video dataset of tasks, intentions, and attention. It contains 14 task classes, 70 intention categories, 28 object classes, 809 videos, and approximately 330,000 frames. Experiments show that our approach outperforms existing approaches.
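The abstract names beam search as the inference procedure on the HAO graph but gives no implementation detail. Below is a minimal sketch of what such sequence decoding could look like, assuming per-segment intention scores have already been computed (in the paper these would come from the pose-attention-object potentials) and an intention-transition matrix has been learned; `beam_search_intentions`, `emission`, and `transition` are illustrative names, not the authors' API.

```python
import numpy as np

def beam_search_intentions(emission, transition, beam_width=5):
    """Decode a likely intention sequence for a video.

    emission:   (T, K) array of log-scores for each of K intention
                categories at each of T video segments (assumed
                precomputed from pose, attention, and object cues).
    transition: (K, K) array of log-scores for moving from intention
                i to intention j between consecutive segments.
    Returns (best_score, best_sequence).
    """
    T, K = emission.shape
    # Each beam entry is (cumulative log-score, intention sequence so far).
    beams = sorted(
        ((emission[0, k], [k]) for k in range(K)),
        key=lambda b: b[0], reverse=True,
    )[:beam_width]

    for t in range(1, T):
        candidates = []
        for score, seq in beams:
            prev = seq[-1]
            for k in range(K):
                new_score = score + transition[prev, k] + emission[t, k]
                candidates.append((new_score, seq + [k]))
        # Keep only the beam_width highest-scoring partial sequences.
        candidates.sort(key=lambda b: b[0], reverse=True)
        beams = candidates[:beam_width]

    return max(beams, key=lambda b: b[0])

# Toy usage with random scores, sized to the dataset's 70 intention
# categories (the real scores would come from the trained HAO model).
rng = np.random.default_rng(0)
emission = np.log(rng.dirichlet(np.ones(70), size=100))  # 100 segments
transition = np.log(rng.dirichlet(np.ones(70), size=70))
score, seq = beam_search_intentions(emission, transition, beam_width=10)
```

Because the beam keeps several partial hypotheses alive, a locally weak intention label can still survive if the transition model favors it later, which is the usual reason to prefer beam search over greedy frame-by-frame decoding on a graph like HAO.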
