Reasoning about Object Instances, Relations and Extents in RGBD Scenes.


Abstract

The vast majority of literature in scene parsing can be described as semantic pixel labeling or semantic segmentation: predicting the semantic class of the object represented by each pixel in the scene. Our familiar perception of the world, however, provides a far richer representation. Firstly, rather than just being able to predict the semantic class of a location in a scene, humans are able to reason about object instances. Discriminating between a region that might represent a single object and one that might represent ten objects is a crucial and basic faculty. Secondly, rather than reasoning about objects as merely occupying the space visible from a single vantage point, we are able to quickly and easily reason about an object's true extent in 3D. Thirdly, rather than viewing a scene as a collection of objects independently existing in space, humans exhibit a representation of scenes that is highly grounded through an intuitive model of physics. Such models allow us to reason about how objects relate physically: via physical support relationships.

Instance segmentation is the task of segmenting a scene into regions that correspond to individual object instances. We argue that this task is not only closer to our own perception of the world than semantic segmentation, but also directly allows for subsequent reasoning about a scene's constituent elements. We explore various strategies for instance segmentation in indoor RGBD scenes.

Firstly, we explore tree-based instance segmentation algorithms. The utility of trees for semantic segmentation has been thoroughly demonstrated; we adapt them to instance segmentation and analyze both greedy and global approaches to inference.

Next, we investigate exemplar-based instance segmentation algorithms, in which a set of representative exemplars is chosen from a large pool of regions and pixels are assigned to exemplars. Inference can be performed either in two stages, exemplar selection followed by pixel-to-exemplar assignment, or in a single joint reasoning stage. We consider the advantages and disadvantages of each approach.

We introduce the task of support-relation prediction, in which we predict which objects are physically supporting other objects. We propose an algorithm and a new set of features for performing discriminative support prediction, demonstrate the effectiveness of our method, and compare training mechanisms.

Finally, we introduce an algorithm for inferring scene and object extent. We demonstrate how reasoning about 3D extent can be done by extending known 2D methods and highlight the strengths and limitations of this approach.
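
The two-stage variant mentioned in the abstract (exemplar selection followed by pixel-to-exemplar assignment) can be pictured with a toy sketch. This is not the thesis's algorithm: the greedy overlap-based selection rule, the `max_iou` threshold, the per-region scores, and all function names below are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence, Set, Tuple

Pixel = Tuple[int, int]  # (row, col)


def iou(a: Set[Pixel], b: Set[Pixel]) -> float:
    """Intersection-over-union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_exemplars(regions: Sequence[Set[Pixel]],
                     scores: Sequence[float],
                     max_iou: float = 0.3) -> List[int]:
    """Stage 1 (assumed rule): visit regions from highest to lowest score and
    keep a region only if it overlaps little with every region kept so far."""
    order = sorted(range(len(regions)), key=lambda i: scores[i], reverse=True)
    chosen: List[int] = []
    for i in order:
        if all(iou(regions[i], regions[j]) <= max_iou for j in chosen):
            chosen.append(i)
    return chosen


def assign_pixels(pixels: Sequence[Pixel],
                  regions: Sequence[Set[Pixel]],
                  scores: Sequence[float],
                  chosen: Sequence[int]) -> Dict[Pixel, int]:
    """Stage 2 (assumed rule): label each pixel with the best-scoring chosen
    exemplar that contains it, or -1 if no chosen exemplar covers it."""
    labels: Dict[Pixel, int] = {}
    for p in pixels:
        covering = [i for i in chosen if p in regions[i]]
        labels[p] = max(covering, key=lambda i: scores[i]) if covering else -1
    return labels


# Hypothetical toy data: a 1x6 strip of pixels and three candidate regions.
pixels = [(0, c) for c in range(6)]
regions = [
    {(0, 0), (0, 1), (0, 2)},  # region 0
    {(0, 1), (0, 2), (0, 3)},  # region 1, overlaps region 0 heavily
    {(0, 3), (0, 4)},          # region 2
]
scores = [0.9, 0.8, 0.7]

chosen = select_exemplars(regions, scores)
labels = assign_pixels(pixels, regions, scores, chosen)
print(chosen)
print(labels)
```

In this sketch the two stages are decoupled, so a poor choice in the first stage cannot be revised in the second; a joint formulation, as the abstract notes, would reason about both decisions at once.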

Bibliographic record

  • Author: Silberman, Nathan
  • Author affiliation: New York University
  • Degree-granting institution: New York University
  • Subject: Computer science
  • Degree: Ph.D.
  • Year: 2015
  • Pages: 152
  • Original format: PDF
  • Language: English
