
3D Object Understanding from RGB-D Data



Abstract

Understanding 3D objects and being able to interact with them in the physical world is essential for building intelligent computer vision systems, with great potential for applications ranging from augmented reality and 3D printing to robotics. Although it seems simple for humans to look at and make sense of the visual world, it is a complicated process for machines to accomplish similar tasks. Generally, such a system involves a series of processes: identifying and segmenting a target object, estimating its 3D shape, and predicting its pose in an open scene where the target object may not have been seen before. Although considerable research has been proposed to tackle these problems, they remain very challenging because of a few key issues: 1) most methods rely solely on color images to interpret the 3D properties of an object; 2) large sets of labeled color images are expensive to obtain for tasks like pose estimation, limiting the ability to train powerful prediction models; 3) training data for the target object is typically required for 3D shape estimation and pose prediction, making these methods hard to scale and to generalize to unseen objects.

Recently, several technological changes have created interesting opportunities for solving these fundamental vision problems. First, low-cost depth sensors have become widely available; they provide an additional sensory input in the form of a depth map, which is very useful for extracting 3D information about the object and scene. Second, with the ease of 3D object scanning using depth sensors and open access to large-scale 3D model databases such as 3D Warehouse and ShapeNet, it is possible to leverage such data to build powerful learning models. Third, machine learning algorithms such as deep learning have become so powerful that they are starting to surpass the previous state of the art, or even human performance, on challenging tasks like object recognition. It is now feasible to learn rich information from large datasets in a single model.

The objective of this thesis is to leverage these emerging tools and data to address the challenges above from a new perspective, by designing machine learning algorithms that use RGB-D data. Instead of depending solely on color images, we combine color and depth images to achieve significantly higher performance for object segmentation. We use a large collection of 3D object models to provide high-quality training data and retrieve visually similar 3D CAD models from low-quality captured depth images, which enables knowledge transfer from database objects to the target object in an observed scene. By using content-based 3D shape retrieval, we also significantly improve pose estimation via similar proxy models, without needing to create an exact 3D model as a reference.

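Bibliographic record

  • Author

    Feng, Jie

  • Author affiliation

    Columbia University

  • Degree-granting institution: Columbia University
  • Subjects: Artificial intelligence; Computer science
  • Degree: Ph.D.
  • Year: 2017
  • Pagination: 157 p.
  • Total pages: 157
  • Original format: PDF
  • Language of text: English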
