Conference: IEEE/CVF Conference on Computer Vision and Pattern Recognition
Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation

Abstract

We present a novel approach to category-level 6D object pose and size estimation. To tackle intra-class shape variation, we learn a canonical shape space (CASS), a unified representation for a large variety of instances of a certain object category. In particular, CASS is modeled as the latent space of a deep generative model of canonical 3D shapes with normalized pose. We train a variational auto-encoder (VAE) to generate 3D point clouds in the canonical space from an RGBD image. The VAE is trained in a cross-category fashion, exploiting publicly available large 3D shape repositories. Since the 3D point cloud is generated in normalized pose (with actual size), the encoder of the VAE learns a view-factorized RGBD embedding: it maps an RGBD image taken from an arbitrary view into a pose-independent 3D shape representation. Object pose is then estimated by contrasting this representation with a pose-dependent feature of the input RGBD image, extracted with a separate deep neural network. We integrate the learning of CASS and pose and size estimation into an end-to-end trainable network, achieving state-of-the-art performance.
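To make "normalized pose (with actual size)" concrete, the following is a minimal geometric sketch, not the network described in the abstract. It assumes a known rotation `R` and translation `t` relating the canonical frame to the camera frame, and shows that mapping observed points back into the canonical frame removes the pose while leaving metric size unchanged. The function names (`to_canonical`, `object_size`, `rotation_z`) and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def to_canonical(points_cam, R, t):
    """Map camera-frame points (N, 3) into the canonical (pose-normalized) frame.

    Assumes the forward model p_cam = R @ p_canon + t, so the inverse is
    p_canon = R.T @ (p_cam - t). With row-vector points this is (p_cam - t) @ R.
    """
    return (points_cam - t) @ R

def object_size(points):
    """Axis-aligned extent (dx, dy, dz) of a point cloud, in its own frame."""
    return points.max(axis=0) - points.min(axis=0)

def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Synthetic canonical shape: points inside a 0.2 x 0.1 x 0.3 box (illustrative data).
rng = np.random.default_rng(0)
canon = rng.uniform(-0.5, 0.5, size=(200, 3)) * np.array([0.2, 0.1, 0.3])

# Place the shape in the camera frame with an assumed pose.
R = rotation_z(0.7)
t = np.array([0.1, -0.2, 1.5])
observed = canon @ R.T + t

# Undo the pose: the result is pose-independent but keeps the actual size.
recovered = to_canonical(observed, R, t)
```

In the approach described above, the pose is of course unknown and must be estimated; the sketch only illustrates why a representation defined in the canonical frame is independent of the viewing pose.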