IEEE/CVF Conference on Computer Vision and Pattern Recognition
Detecting and Recognizing Human-Object Interactions


Abstract

To understand the visual world, a machine must not only recognize individual object instances but also how they interact. Humans are often at the center of such interactions, and detecting human-object interactions is an important practical and scientific problem. In this paper, we address the task of detecting (human, verb, object) triplets in challenging everyday photos. We propose a novel model that is driven by a human-centric approach. Our hypothesis is that the appearance of a person (their pose, clothing, and action) is a powerful cue for localizing the objects they are interacting with. To exploit this cue, our model learns to predict an action-specific density over target object locations based on the appearance of a detected person. Our model also jointly learns to detect people and objects, and by fusing these predictions it efficiently infers interaction triplets in a clean, jointly trained end-to-end system we call InteractNet. We validate our approach on the recently introduced Verbs in COCO (V-COCO) and HICO-DET datasets, where we show quantitatively compelling results.
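
The fusion idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the density or the fusion rule, so everything below is an assumption made for illustration: the density is taken to be an unnormalized Gaussian over a box offset encoding, the scores are fused by multiplication, and the names `box_delta`, `target_density`, `triplet_score`, and the value `sigma=0.3` are hypothetical.

```python
import math

def box_delta(human, obj):
    """Encode the object box relative to the human box.

    Boxes are (cx, cy, w, h) with (cx, cy) the box center.
    The encoding (normalized center offset, log size ratio) is an assumption.
    """
    hx, hy, hw, hh = human
    ox, oy, ow, oh = obj
    return ((ox - hx) / hw, (oy - hy) / hh,
            math.log(ow / hw), math.log(oh / hh))

def target_density(delta, mu, sigma=0.3):
    """Unnormalized Gaussian compatibility between an observed offset and
    the action-specific mean offset predicted from the person's appearance.
    The Gaussian form and sigma are illustrative assumptions."""
    sq_dist = sum((d - m) ** 2 for d, m in zip(delta, mu))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def triplet_score(s_human, s_object, s_action, density):
    """Fuse detection, action, and localization scores by multiplication
    (an assumed fusion rule)."""
    return s_human * s_object * s_action * density

# Hypothetical example: one detected person and two candidate objects.
human = (100.0, 100.0, 50.0, 100.0)
mu_ride = (0.0, 0.5, 0.0, 0.0)           # predicted target offset for "ride"
near_obj = (100.0, 150.0, 50.0, 100.0)   # directly below the person
far_obj = (400.0, 100.0, 50.0, 100.0)    # far to the right

for name, obj in (("near", near_obj), ("far", far_obj)):
    g = target_density(box_delta(human, near_obj if name == "near" else far_obj), mu_ride)
    print(name, triplet_score(0.9, 0.8, 0.7, g))
```

In this sketch the candidate whose offset matches the predicted target location keeps nearly all of its detection score, while the distant candidate is suppressed toward zero, which is the sense in which the person's appearance helps localize the interacting object.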