Neurocomputing

Semantic scene completion with dense CRF from a single depth image


Abstract

Scene understanding is a significant research topic in computer vision, especially for robots that must interpret their environment intelligently. Semantic scene segmentation helps robots identify the objects present in their surroundings, while semantic scene completion enhances a robot's ability to infer object shape, which is pivotal for several high-level tasks. With a dense Conditional Random Field (CRF), one key issue is how to construct the long-range interactions between nodes with Gaussian pairwise potentials. Another issue is which effective and efficient inference algorithms can be adapted to solve the resulting optimization. In this paper, we address semantic scene segmentation and completion simultaneously, using a dense CRF based on a single depth image only. Firstly, we convert the single depth image into different down-sampled Truncated Signed Distance Function (TSDF) or flipped TSDF voxel formats, and formulate the pairwise potential terms with this representation. Secondly, we use the output of an end-to-end 3D convolutional neural network named SSCNet to obtain the unary potentials. Finally, we evaluate the efficiency of different CRF inference algorithms (mean-field inference, the negative semi-definite specific difference-of-convex relaxation, the proximal minimization of linear programming and its variants, etc.). The proposed dense CRF and inference algorithms are evaluated on three different datasets (SUNCG, NYU, and NYUCAD). Experimental results demonstrate that the voxel-level intersection over union (IoU) of the predicted voxels' semantic labels and completion reaches state-of-the-art levels. Specifically, for voxel semantic segmentation, the highest IoU improvements are 2.6%, 1.3%, and 3.1%, and for scene completion, the highest IoU improvements are 2.5%, 3.7%, and 5.4%, on the SUNCG, NYU, and NYUCAD datasets, respectively. (C) 2018 Elsevier B.V. All rights reserved.
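The TSDF and flipped-TSDF encodings mentioned above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the truncation distance, and the per-ray simplification are assumptions; it only shows how each voxel along a camera ray is given a signed, truncated distance to the observed surface, with the flipped variant concentrating its gradient near the surface (as in the SSCNet-style encoding).

```python
import numpy as np

def tsdf_from_depth(depth_values, voxel_depths, trunc=0.3, flipped=False):
    """Hypothetical sketch of a per-ray (flipped) TSDF encoding.

    depth_values: observed surface depth for each ray (meters)
    voxel_depths: depth of each voxel centre along the same ray (meters)
    trunc:        truncation distance (assumed value)
    """
    d = depth_values - voxel_depths           # signed distance to the surface
    tsdf = np.clip(d / trunc, -1.0, 1.0)      # truncate to [-1, 1]
    if flipped:
        # flipped TSDF: magnitude grows toward the surface instead of away
        tsdf = np.sign(tsdf) * (1.0 - np.abs(tsdf))
    return tsdf
```

A voxel halfway inside the truncation band (e.g. 0.15 m in front of a surface at 1.0 m with `trunc=0.3`) gets a TSDF value of 0.5, while a voxel at the band's edge gets 1.0 in the plain encoding but 0.0 in the flipped one.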
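Of the inference algorithms compared in the abstract, mean-field inference is the most common baseline for dense CRFs. The sketch below shows one generic mean-field update over voxel marginals, assuming unary potentials from an SSCNet-like network, a Potts-style label compatibility, and a precomputed dense Gaussian kernel; all names are illustrative and the dense `(N, N)` kernel is used only for clarity (real systems use fast Gaussian filtering instead).

```python
import numpy as np

def mean_field_step(unary, Q, compat, kernel):
    """One mean-field update for a dense CRF (illustrative sketch).

    unary:  (N, L) negative log unary potentials per voxel and label
    Q:      (N, L) current approximate marginals
    compat: (L, L) label compatibility matrix (e.g. Potts)
    kernel: (N, N) Gaussian pairwise weights; diagonal assumed zero
    """
    msg = kernel @ Q                  # message passing: filter the marginals
    pair = msg @ compat.T             # compatibility transform
    logits = -unary - pair            # combine unary and pairwise energies
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    Q_new = np.exp(logits)
    return Q_new / Q_new.sum(axis=1, keepdims=True)   # renormalize
```

Iterating this update a fixed number of times (typically 5-10) from `Q = softmax(-unary)` yields the refined per-voxel label distribution.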
