Venue: International Conference on Computer Vision

Aligning Latent Spaces for 3D Hand Pose Estimation

Abstract

Hand pose estimation from monocular RGB inputs is a highly challenging task. Many previous works in the monocular setting used only RGB information for training, despite the availability of corresponding data in other modalities such as depth maps. In this work, we propose to learn a joint latent representation that leverages other modalities as weak labels to boost the RGB-based hand pose estimator. By design, our architecture is highly flexible in embedding diverse modalities such as heat maps, depth maps and point clouds. In particular, we find that encoding and decoding the point cloud of the hand surface can improve the quality of the joint latent representation. Experiments show that, with the aid of other modalities during training, our proposed method boosts the accuracy of RGB-based hand pose estimation systems and significantly outperforms the state of the art on two public benchmarks.
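
The idea of a joint latent space shared across modalities can be illustrated with a small sketch. This is not the paper's architecture: it assumes plain linear encoders and a linear pose decoder, a squared-error alignment term between the two latent codes, and a hand with 21 joints (63 pose values). All names here (`joint_latent_loss`, `predict_pose`, `enc_rgb`, `enc_cloud`, `dec_pose`, `align_weight`) are hypothetical and chosen only for illustration.

```python
import random


def linear(weights, x):
    """Apply a weight matrix (a list of rows) to a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def init(rows, cols, rng):
    """Small random weight matrix with `rows` outputs and `cols` inputs."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_latent_loss(params, rgb_feat, cloud_feat, pose_gt, align_weight=1.0):
    """Training loss for one sample (illustrative assumption, not the paper's loss).

    Both modalities are encoded into the same latent space, both latents are
    decoded to a pose by a shared decoder, and an extra term pulls the two
    latent codes together.
    """
    z_rgb = linear(params["enc_rgb"], rgb_feat)
    z_cloud = linear(params["enc_cloud"], cloud_feat)
    pose_loss = (mse(linear(params["dec_pose"], z_rgb), pose_gt)
                 + mse(linear(params["dec_pose"], z_cloud), pose_gt))
    align_loss = mse(z_rgb, z_cloud)
    return pose_loss + align_weight * align_loss


def predict_pose(params, rgb_feat):
    """Test-time path: only the RGB encoder and the pose decoder are used."""
    return linear(params["dec_pose"], linear(params["enc_rgb"], rgb_feat))


# Toy dimensions (assumptions): 63 = 21 joints x 3 coordinates.
RGB_DIM, CLOUD_DIM, LATENT_DIM, POSE_DIM = 8, 6, 4, 63

rng = random.Random(0)
params = {
    "enc_rgb": init(LATENT_DIM, RGB_DIM, rng),
    "enc_cloud": init(LATENT_DIM, CLOUD_DIM, rng),
    "dec_pose": init(POSE_DIM, LATENT_DIM, rng),
}
rgb_feat = [rng.random() for _ in range(RGB_DIM)]
cloud_feat = [rng.random() for _ in range(CLOUD_DIM)]
pose_gt = [rng.random() for _ in range(POSE_DIM)]

loss = joint_latent_loss(params, rgb_feat, cloud_feat, pose_gt)
pose = predict_pose(params, rgb_feat)
```

In this sketch the point-cloud branch appears only in the training loss, which mirrors the abstract's statement that other modalities assist during training while the estimator itself remains RGB-based.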