Conference: International Conference on Computer Vision

Holistic++ Scene Understanding: Single-View 3D Holistic Scene Parsing and Human Pose Estimation With Human-Object Interaction and Physical Commonsense



Abstract

We propose a new 3D holistic++ scene understanding problem, which jointly tackles two tasks from a single-view image: (i) holistic scene parsing and reconstruction, that is, 3D estimation of object bounding boxes, camera pose, and room layout, and (ii) 3D human pose estimation. The intuition is to leverage the coupled nature of these two tasks to improve the granularity and performance of scene understanding. We exploit two essential connections between the tasks: (i) human-object interaction (HOI), which models the fine-grained relations between agents and objects in the scene, and (ii) physical commonsense, which models the physical plausibility of the reconstructed scene. The optimal configuration of the 3D scene, represented by a parse graph, is inferred using Markov chain Monte Carlo (MCMC), which efficiently traverses the non-differentiable joint solution space. Experimental results demonstrate that the proposed algorithm significantly improves the performance of both tasks on three datasets, showing improved generalization ability.
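As a rough illustration of the kind of inference the abstract describes, the sketch below runs a Metropolis-Hastings sampler over a toy energy that contains non-differentiable terms. It is not the authors' implementation. The three-variable state (chair position, chair height, human position), the energy terms, the weights, and the names `energy` and `metropolis` are all assumptions made for illustration. The real method operates on a full parse graph.

```python
import math
import random


def energy(state):
    """Toy energy (lower is better). Every term here is an illustrative assumption."""
    chair_x, chair_z, human_x = state
    # Hypothetical HOI term: a sitting person should be close to the chair.
    hoi = abs(human_x - chair_x)
    # Hypothetical physical term: the chair should rest on the floor (z = 0).
    # Sinking below the floor is penalized more heavily than floating above it.
    physics = 10.0 * max(0.0, -chair_z) + max(0.0, chair_z)
    # Hypothetical observation term: image evidence places the chair near x = 2.
    observation = abs(chair_x - 2.0)
    return hoi + physics + observation


def metropolis(initial, steps=5000, step_size=0.2, temperature=0.1, seed=0):
    """Metropolis-Hastings with a symmetric Gaussian proposal on one coordinate."""
    rng = random.Random(seed)
    state = list(initial)
    current_e = energy(state)
    best, best_e = list(state), current_e
    for _ in range(steps):
        proposal = list(state)
        i = rng.randrange(len(proposal))
        proposal[i] += rng.gauss(0.0, step_size)
        proposal_e = energy(proposal)
        # Accept downhill moves always; accept uphill moves with
        # probability exp(-(E_new - E_old) / T). No gradients are needed.
        delta = proposal_e - current_e
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            state, current_e = proposal, proposal_e
            if current_e < best_e:
                best, best_e = list(state), current_e
    return best, best_e


if __name__ == "__main__":
    start = [0.0, 1.0, 5.0]  # chair floating at x = 0, human far away at x = 5
    best, best_e = metropolis(start)
    print("start energy:", energy(start))
    print("best state:", best, "best energy:", best_e)
```

Because the sampler only evaluates the energy and never differentiates it, terms built from `abs` and `max` pose no difficulty, which is the property the abstract attributes to MCMC for this problem.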
