European Conference on Computer Vision

3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning

Abstract

3D human shape and pose estimation from monocular images has been an active area of research in computer vision, with a substantial impact on the development of new applications, from activity recognition to virtual avatar creation. Existing deep learning methods for 3D human shape and pose estimation rely on relatively high-resolution input images; however, high-resolution visual content is not always available in practical scenarios such as video surveillance and sports broadcasting. Low-resolution images in real scenarios vary widely in size, and a model trained at one resolution does not typically degrade gracefully across resolutions. Two common approaches to the problem of low-resolution input are applying super-resolution techniques to the input images, which may introduce visual artifacts, or training one model for each resolution, which is impractical in many realistic applications. To address these issues, this paper proposes a novel algorithm called RSC-Net, which consists of a Resolution-aware network, a Self-supervision loss, and a Contrastive learning scheme. The proposed network learns 3D body shape and pose across different resolutions with a single model. The self-supervision loss encourages scale-consistency of the output, and the contrastive learning scheme enforces scale-consistency of the deep features. We show that both of these new training losses provide robustness when learning 3D shape and pose in a weakly-supervised manner. Extensive experiments demonstrate that RSC-Net achieves consistently better results than state-of-the-art methods on challenging low-resolution images.
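
The two scale-consistency ideas in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the loss formulas, so the mean-squared output gap and the InfoNCE-style feature loss below are assumed stand-ins, and the names `make_pyramid`, `output_consistency_loss`, and `feature_contrastive_loss` are hypothetical.

```python
import numpy as np


def make_pyramid(image, factors=(1, 2, 4)):
    """Simulate several input resolutions of one H x W x C image.

    Each level is average-pooled by `f`, then repeated back to the
    original grid so every level has the same array shape.
    """
    h, w, c = image.shape
    levels = []
    for f in factors:
        cropped = image[: h - h % f, : w - w % f]
        pooled = cropped.reshape(h // f, f, w // f, f, c).mean(axis=(1, 3))
        levels.append(np.repeat(np.repeat(pooled, f, axis=0), f, axis=1))
    return levels


def output_consistency_loss(preds):
    """Assumed form: mean squared gap between each lower-resolution
    prediction and the highest-resolution prediction, preds[0]."""
    ref = preds[0]
    return float(np.mean([np.mean((p - ref) ** 2) for p in preds[1:]]))


def feature_contrastive_loss(feats_hi, feats_lo, temperature=0.1):
    """Assumed InfoNCE-style loss over N x D feature matrices.

    Row i of `feats_hi` and row i of `feats_lo` come from the same
    image (positive pair); all other rows act as negatives.
    """
    a = feats_hi / np.linalg.norm(feats_hi, axis=1, keepdims=True)
    b = feats_lo / np.linalg.norm(feats_lo, axis=1, keepdims=True)
    logits = a @ b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))
```

In a training loop, `preds` would hold the network's shape and pose outputs for each level of the pyramid, and the two feature matrices would hold its intermediate features for the high- and low-resolution versions of the same batch. Both losses need no 3D labels, which is consistent with the weakly-supervised setting the abstract describes.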