International Journal of Computer Vision

Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation


Abstract

In this paper, we address the problems of contour detection, bottom-up grouping, object detection and semantic segmentation on RGB-D data. We focus on the challenging setting of cluttered indoor scenes, and evaluate our approach on the recently introduced NYU-Depth V2 (NYUD2) dataset (Silberman et al., ECCV, 2012). We propose algorithms for object boundary detection and hierarchical segmentation that generalize the approach of Arbelaez et al. (TPAMI, 2011) by making effective use of depth information. We show that our system can label each contour with its type (depth, normal or albedo). We also propose a generic method for long-range amodal completion of surfaces and show its effectiveness in grouping. We train RGB-D object detectors by computing histograms of oriented gradients on the depth image and using them with deformable part models (Felzenszwalb et al., TPAMI, 2010). We observe that this simple strategy for training object detectors significantly outperforms more complicated models in the literature. We then turn to the problem of semantic segmentation, for which we propose an approach that classifies superpixels into the dominant object categories in the NYUD2 dataset. We design generic and class-specific features to encode the appearance and geometry of objects. We also show that additional features computed from RGB-D object detectors and scene classifiers further improve semantic segmentation accuracy. In all of these tasks, we report significant improvements over the state-of-the-art.
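The detector-training strategy the abstract describes, computing histograms of oriented gradients (HOG) on the depth channel before feeding them to deformable part models, can be illustrated with a minimal sketch. This is plain numpy, not the authors' code; the cell size and bin count are illustrative defaults, and depth gradients stand in for the intensity gradients of standard HOG:

```python
import numpy as np

def hog_depth(depth, cell=8, bins=9):
    """Compute a simple cell-wise HOG descriptor over a depth map.

    Gradients of the depth image capture surface-orientation
    discontinuities, playing the role that intensity edges play
    in RGB HOG.
    """
    # Depth gradients along rows (gy) and columns (gx)
    gy, gx = np.gradient(depth.astype(np.float64))
    mag = np.hypot(gx, gy)
    # Unsigned gradient orientation in [0, pi)
    ang = np.mod(np.arctan2(gy, gx), np.pi)
    bin_idx = np.minimum((ang / np.pi * bins).astype(int), bins - 1)

    h, w = depth.shape
    ch, cw = h // cell, w // cell
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):
        for j in range(cw):
            # Magnitude-weighted orientation histogram for this cell
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            b = bin_idx[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            hist[i, j] = np.bincount(b.ravel(), weights=m.ravel(),
                                     minlength=bins)
    # Per-cell L2 normalization, then flatten into one descriptor
    norm = np.linalg.norm(hist, axis=2, keepdims=True) + 1e-6
    return (hist / norm).ravel()

# Toy depth map: a 32x32 planar ramp receding to the right
depth = np.tile(np.linspace(1.0, 2.0, 32), (32, 1))
desc = hog_depth(depth)
print(desc.shape)  # 4x4 cells x 9 bins -> (144,)
```

In the paper's pipeline the resulting descriptor would be concatenated with RGB HOG features and used as input to a DPM; the sketch above covers only the depth-feature step.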
