IEEE Transactions on Pattern Analysis and Machine Intelligence

3D Human Pose Machines with Self-Supervised Learning

Abstract

Driven by recent computer vision and robotic applications, recovering 3D human poses has become increasingly important and has attracted growing interest. The task remains challenging due to the diverse appearances, viewpoints, occlusions, and inherent geometric ambiguities in monocular images. Most existing methods focus on designing elaborate priors/constraints to directly regress 3D human poses from the corresponding 2D pose-aware features or 2D pose predictions. However, due to insufficient 3D pose data for training and the domain gap between 2D and 3D space, these methods have limited scalability to practical scenarios (e.g., outdoor scenes). To address this issue, this paper proposes a simple yet effective self-supervised correction mechanism that learns the intrinsic structures of human poses from abundant images. Specifically, the proposed mechanism involves two dual learning tasks, i.e., 2D-to-3D pose transformation and 3D-to-2D pose projection, which serve as a bridge between 3D and 2D human poses and provide a form of "free" self-supervision for accurate 3D human pose estimation. The 2D-to-3D pose transformation sequentially regresses intermediate 3D poses by lifting the pose representation from the 2D domain to the 3D domain under a sequence-dependent temporal context, while the 3D-to-2D pose projection refines the intermediate 3D poses by enforcing geometric consistency between the 2D projections of the 3D poses and the estimated 2D poses. Together, these two dual learning tasks enable our model to adaptively learn from both 3D human pose data and external large-scale 2D human pose data. We further apply the self-supervised correction mechanism to develop a 3D human pose machine that jointly integrates 2D spatial relationships, temporal smoothness of predictions, and 3D geometric knowledge. Extensive evaluations on the Human3.6M and HumanEva-I benchmarks demonstrate the superior performance and efficiency of our framework over all compared competing methods.
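
As a rough illustration of the 3D-to-2D projection consistency idea described in the abstract, the sketch below projects a 3D pose onto the image plane with a pinhole camera model and measures its discrepancy against an estimated 2D pose. It is a minimal sketch, not the authors' implementation; the joint count, camera intrinsics, and function names are illustrative assumptions.

# Minimal sketch (assumed, not from the paper) of a 3D-to-2D projection
# consistency term: project an intermediate 3D pose back to the image plane
# and penalize its distance to the estimated 2D pose.
import numpy as np

N_JOINTS = 17  # assumed joint count (a Human3.6M-style skeleton)

def project_to_2d(pose_3d, focal=(1145.0, 1144.0), center=(512.0, 515.0)):
    """Pinhole projection of an (N_JOINTS, 3) camera-space pose to 2D pixels."""
    fx, fy = focal
    cx, cy = center
    x, y, z = pose_3d[:, 0], pose_3d[:, 1], pose_3d[:, 2]
    u = fx * x / z + cx
    v = fy * y / z + cy
    return np.stack([u, v], axis=1)

def projection_consistency_loss(pose_3d, pose_2d_est):
    """Mean squared 2D error between projected 3D joints and estimated 2D joints."""
    diff = project_to_2d(pose_3d) - pose_2d_est
    return float(np.mean(np.sum(diff ** 2, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 3D pose roughly 4 m in front of the camera, plus a noisy 2D estimate.
    pose_3d = rng.normal(size=(N_JOINTS, 3)) * 0.3 + np.array([0.0, 0.0, 4.0])
    pose_2d = project_to_2d(pose_3d) + rng.normal(scale=2.0, size=(N_JOINTS, 2))
    print("consistency loss:", projection_consistency_loss(pose_3d, pose_2d))

In the paper's setting, a term of this kind would supply a training signal from 2D-only pose data, since minimizing it pulls the intermediate 3D pose toward agreement with the estimated 2D pose.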