Published in: IEEE Transactions on Multimedia

Learning Discriminative and Generative Shape Embeddings for Three-Dimensional Shape Retrieval

Abstract

Multi-view shape descriptors are an important solution for 3D shape retrieval and have achieved impressive performance. A crucial part of any view-based shape descriptor is interpreting 3D structure through multiple 2D observations. Most existing methods, such as MVCNN, assume that a strong classification model trained with deep learning can provide an efficient shape embedding for 3D shape retrieval. However, these methods focus on discriminative models, and they do not necessarily incorporate the underlying 3D properties of the objects shown in the 2D images. In this paper, we present a novel encoder-decoder recurrent feature aggregation network (ERFA-Net) to address this problem. To emphasize the 3D properties of shapes when fusing multiple view features, we introduce 3D property prediction tasks into 3D shape retrieval. Specifically, an image sequence of the shape is recurrently aggregated into a discriminative shape embedding by an LSTM network, and this latent shape embedding is then trained to predict the original voxel grid and to estimate images from unseen viewpoints. This generation task provides effective supervision that makes the network exploit the 3D properties of shapes through multiple 2D images. Our method achieves state-of-the-art performance for 3D shape retrieval on two large-scale 3D shape datasets, ModelNet and ShapeNetCore55. Extensive experiments show that the proposed 3D representation discriminates robustly under view occlusion and has strong generation ability for various 3D shape tasks.
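The recurrent aggregation and retrieval steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' ERFA-Net implementation: it assumes per-view feature vectors are already available, uses a single bias-free LSTM cell with random, untrained weights, and omits the voxel and unseen-view decoders, which only matter during training. The names `TinyLSTM`, `aggregate`, `retrieve`, and `random_views`, the cosine similarity ranking, and all dimensions (12 views, 8 features per view, 16-dimensional embedding) are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weights, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


class TinyLSTM:
    """A single bias-free LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = random.Random(seed)
        self.hid_dim = hid_dim
        # One weight matrix per gate, applied to the concatenation [x; h].
        self.weights = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(in_dim + hid_dim)]
                   for _ in range(hid_dim)]
            for gate in ("input", "forget", "output", "cand")
        }

    def step(self, x, h, c):
        """Advance the cell by one view: returns the new hidden and cell states."""
        z = list(x) + h
        i = [sigmoid(a) for a in matvec(self.weights["input"], z)]
        f = [sigmoid(a) for a in matvec(self.weights["forget"], z)]
        o = [sigmoid(a) for a in matvec(self.weights["output"], z)]
        g = [math.tanh(a) for a in matvec(self.weights["cand"], z)]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def aggregate(self, view_features):
        """Fold an ordered sequence of per-view features into one embedding."""
        h = [0.0] * self.hid_dim
        c = [0.0] * self.hid_dim
        for x in view_features:
            h, c = self.step(x, h, c)
        return h  # the final hidden state serves as the shape embedding


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query, gallery):
    """Return gallery indices sorted from most to least similar to the query."""
    return sorted(range(len(gallery)),
                  key=lambda k: cosine(query, gallery[k]), reverse=True)


def random_views(rng, n_views=12, dim=8):
    """Stand-in for CNN features of rendered views (random numbers here)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_views)]


if __name__ == "__main__":
    rng = random.Random(1)
    encoder = TinyLSTM(in_dim=8, hid_dim=16)
    shapes = [random_views(rng) for _ in range(5)]
    gallery = [encoder.aggregate(views) for views in shapes]
    # Query with shape 2's own embedding; it should rank itself first.
    print(retrieve(gallery[2], gallery))
```

In a trained system of the kind the abstract describes, the embedding returned by `aggregate` would additionally feed decoder branches during training, so that the same vector used for ranking is also pushed to carry 3D structure.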