Home > Foreign Journals > Image and Vision Computing > CrossFusion net: Deep 3D object detection based on RGB images and point clouds in autonomous driving

CrossFusion net: Deep 3D object detection based on RGB images and point clouds in autonomous driving



Abstract

In recent years, accurate 3D detection has played an important role in many applications; autonomous driving is a typical example. This paper aims to design an accurate 3D detector that takes both LiDAR point clouds and RGB images as inputs, motivated by the fact that LiDAR and cameras have complementary merits. A novel deep end-to-end two-stream learnable architecture, CrossFusion Net, is designed to exploit features from both LiDAR point clouds and RGB images through a hierarchical fusion structure. Specifically, CrossFusion Net utilizes the bird's eye view (BEV) of point clouds obtained through projection. The feature maps of the two streams are fused through the newly introduced CrossFusion (CF) layer. The proposed CF layer transforms the feature maps of one stream into the other based on the spatial relationship between the BEV and RGB images. Additionally, we apply an attention mechanism to the transformed feature map and the original one to automatically decide the importance of the two feature maps from the two sensors. Experiments on the challenging KITTI car 3D detection and BEV detection benchmarks show that the presented approach outperforms other state-of-the-art methods in average precision (AP); in particular, it outperforms UberATG-ContFuse [3] by 8% AP on moderate 3D car detection. Furthermore, the proposed network learns an effective representation of the surrounding environment via RGB and BEV feature maps. (C) 2020 Elsevier B.V. All rights reserved.
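The two ingredients the abstract names, projecting a LiDAR point cloud into a BEV grid and attention-weighting a stream's own feature map against the map transformed from the other stream, can be illustrated with a minimal NumPy sketch. All parameters here (grid ranges, resolution, and the channel-mean attention score) are illustrative assumptions, not the paper's actual CF-layer design:

```python
import numpy as np

def bev_projection(points, x_range=(0, 70), y_range=(-40, 40), res=0.5):
    """Project LiDAR points (N, 3: x, y, z) onto a BEV height map.
    Each grid cell keeps the maximum point height falling into it."""
    h = int((x_range[1] - x_range[0]) / res)
    w = int((y_range[1] - y_range[0]) / res)
    bev = np.zeros((h, w), dtype=np.float32)
    xs = ((points[:, 0] - x_range[0]) / res).astype(int)
    ys = ((points[:, 1] - y_range[0]) / res).astype(int)
    valid = (xs >= 0) & (xs < h) & (ys >= 0) & (ys < w)
    for x, y, z in zip(xs[valid], ys[valid], points[valid, 2]):
        bev[x, y] = max(bev[x, y], z)
    return bev

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_fusion(own, transformed):
    """Attention-weighted blend of a stream's own feature map (H, W, C)
    with the map transformed from the other stream: a per-location
    softmax gate decides how much each sensor contributes."""
    # Hypothetical scoring: per-location channel means as importance logits.
    scores = np.stack([own.mean(-1), transformed.mean(-1)], axis=-1)  # (H, W, 2)
    gate = softmax(scores, axis=-1)
    return gate[..., :1] * own + gate[..., 1:] * transformed
```

Because the gate is a convex combination per spatial location, the fused map always lies between the two input maps element-wise, letting the network lean on whichever sensor is more informative at each location.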

