International Conference on Digital Image Computing: Techniques and Applications

SL3D - Single Look 3D Object Detection based on RGB-D Images

Abstract

We present SL3D, a Single Look 3D object detection approach for detecting 3D objects from an RGB-D image pair. The approach is a proposal-free, single-stage 3D object detection method for RGB-D images that leverages multi-scale feature fusion of RGB and depth feature maps and multi-layer predictions. The method takes a pair of RGB and depth images as input and outputs predicted 3D bounding boxes. The SL3D neural network comprises two modules: multi-scale feature fusion and multi-layer prediction. The multi-scale feature fusion module fuses multi-scale features from the RGB and depth feature maps, which are later used by the multi-layer prediction module for 3D object detection. Each location of a prediction layer is assigned a set of predefined 3D prior boxes to account for the varying shapes of 3D objects. The network regresses the predicted 3D bounding boxes as offsets to the set of 3D prior boxes, and duplicate 3D bounding boxes are removed by applying 3D non-maximum suppression. The network is trained end-to-end on the publicly available SUN RGB-D dataset. With ResNeXt50, SL3D achieves 31.77 mAP on the SUN RGB-D test set at an inference speed of approximately 4 fps; with MobileNetV2, it achieves approximately 15 fps with a reduction of around 2 mAP. The quantitative results show that the proposed method achieves performance competitive with state-of-the-art methods on the SUN RGB-D dataset at near real-time inference speed.
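
The duplicate-removal step mentioned in the abstract, 3D non-maximum suppression, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the box parameterization or the overlap threshold, so the axis-aligned box format `(x_min, y_min, z_min, x_max, y_max, z_max)`, the function names `iou_3d` and `nms_3d`, and the default `iou_threshold=0.5` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Box format (an assumption for this sketch, not taken from the paper):
# axis-aligned, (x_min, y_min, z_min, x_max, y_max, z_max).
Box = Tuple[float, float, float, float, float, float]


def volume(box: Box) -> float:
    """Volume of an axis-aligned box; zero if any extent is non-positive."""
    dx = max(0.0, box[3] - box[0])
    dy = max(0.0, box[4] - box[1])
    dz = max(0.0, box[5] - box[2])
    return dx * dy * dz


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter: Box = (
        max(a[0], b[0]), max(a[1], b[1]), max(a[2], b[2]),
        min(a[3], b[3]), min(a[4], b[4]), min(a[5], b[5]),
    )
    inter_vol = volume(inter)
    union = volume(a) + volume(b) - inter_vol
    return inter_vol / union if union > 0.0 else 0.0


def nms_3d(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # hypothetical default, not from the paper
) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou_3d(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

For example, given two heavily overlapping unit boxes with scores 0.9 and 0.8 and a third distant box with score 0.7, `nms_3d` keeps the first and third boxes. A detector that predicts oriented boxes would need a rotated-box IoU in place of `iou_3d`.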
