Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence

Learning 3D Object Templates by Quantizing Geometry and Appearance Spaces


Abstract

While 3D object-centered, shape-based models are appealing compared with 2D viewer-centered, appearance-based models for their lower model complexity and potentially better generalization across views, the learning and inference of 3D models have been much less studied in the recent literature due to two factors: i) the enormous complexity of 3D shapes in geometric space; and ii) the gap between 3D shapes and their appearances in images. This paper aims to tackle these two problems by studying an And-Or Tree (AoT) representation that consists of two parts: i) a geometry-AoT quantizing the geometry space, i.e., the possible compositions of 3D volumetric parts and 2D surfaces within the volumes; and ii) an appearance-AoT quantizing the appearance space, i.e., the appearance variations of those shapes in different views. In this AoT, an And-node decomposes an entity into its constituent parts, and an Or-node represents alternative ways of decomposition. Thus it can express a combinatorial number of geometry and appearance configurations through small dictionaries of 3D shape primitives and 2D image primitives. In the quantized space, the problem of learning a 3D object template is transformed into a structure search problem, which can be solved efficiently by a dynamic programming algorithm that maximizes the information gain. We focus on learning 3D car templates from the AoT and collect a new car dataset featuring more diverse views. The learned car templates integrate the shape-based model and the appearance-based model to combine the benefits of both.
In experiments, we show three aspects: 1) the AoT is more efficient than the frequently used octree method in space representation; 2) the learned 3D car template matches the state-of-the-art performance on car detection and pose estimation on a public multi-view car dataset; and 3) on our new dataset, the learned 3D template solves the joint task of simultaneous object detection, pose/view estimation, and part localization. It generalizes over unseen views and performs better than version 5 of the DPM in terms of object detection and semantic part localization.
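
The And-Or structure described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the node names, the tiny tree, and the per-leaf gain values are all made up for illustration, and how the information gain is actually computed from data is not shown here. The sketch only demonstrates two ideas from the abstract: an And-Or tree encodes a combinatorial number of configurations (product at And-nodes, sum at Or-nodes), and a bottom-up dynamic-programming pass can pick the configuration with the highest total gain by visiting each node once.

```python
import math
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    kind: str                          # "and", "or", or "leaf"
    name: str
    children: List["Node"] = field(default_factory=list)
    gain: float = 0.0                  # hypothetical information gain (leaves only)


def count_configs(node: Node) -> int:
    """Number of distinct configurations the subtree can express."""
    if node.kind == "leaf":
        return 1
    counts = [count_configs(c) for c in node.children]
    # An And-node combines all children; an Or-node picks exactly one child.
    return math.prod(counts) if node.kind == "and" else sum(counts)


def best_parse(node: Node) -> Tuple[float, List[str]]:
    """Bottom-up pass: best total gain and the leaves that achieve it."""
    if node.kind == "leaf":
        return node.gain, [node.name]
    results = [best_parse(c) for c in node.children]
    if node.kind == "and":
        # All parts are kept, so gains add up and leaf lists concatenate.
        total = sum(g for g, _ in results)
        leaves = [name for _, names in results for name in names]
        return total, leaves
    # Or-node: keep only the best alternative.
    return max(results, key=lambda r: r[0])


# Toy tree (illustrative only): a volume is either one solid block,
# or is split into two halves, each with two alternative surface primitives.
root = Node("or", "volume", [
    Node("leaf", "solid_block", gain=1.0),
    Node("and", "split", [
        Node("or", "left", [
            Node("leaf", "left_flat", gain=0.75),
            Node("leaf", "left_curved", gain=0.5),
        ]),
        Node("or", "right", [
            Node("leaf", "right_flat", gain=0.25),
            Node("leaf", "right_curved", gain=1.0),
        ]),
    ]),
])

print(count_configs(root))   # 5 configurations: 1 + 2 * 2
print(best_parse(root))      # (1.75, ['left_flat', 'right_curved'])
```

In this toy tree, seven nodes express five configurations; with deeper trees the number of configurations grows much faster than the number of nodes, while the search cost stays linear in the node count.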
