IEEE International Conference on Robotics and Automation

Fusing LIDAR and images for pedestrian detection using convolutional neural networks

Abstract

In this paper, we explore various aspects of fusing LIDAR and color imagery for pedestrian detection in the context of convolutional neural networks (CNNs), which have recently become the state of the art for many vision problems. We incorporate LIDAR by up-sampling the point cloud to a dense depth map and then extracting three features that represent different aspects of the 3D scene. We then use those features as extra image channels. Specifically, we leverage recent work on HHA [9] (horizontal disparity, height above ground, and angle) representations, adapting the code to work on up-sampled LIDAR rather than Microsoft Kinect depth maps. We show, for the first time, that such a representation is applicable to up-sampled LIDAR data despite its sparsity. Since CNNs learn a deep hierarchy of feature representations, we then explore the question: at what level of representation should this additional information be fused with the original RGB image channels? We use the KITTI pedestrian detection dataset for our exploration. We first replicate the finding that region-CNNs (R-CNNs) [8] can outperform the original proposal mechanism using only RGB images, but only if fine-tuning is employed. Then we show that: 1) using HHA features together with RGB images performs better than RGB alone, even without any fine-tuning on large RGB web data; 2) fusing RGB and HHA achieves the strongest results if done late, but, under a parameter or computational budget, is best done at the early to middle layers of the hierarchical representation, which tend to represent mid-level features rather than low-level (e.g. edges) or high-level (e.g. object-class decision) features; 3) some of the less successful methods have the most parameters, indicating that increased classification accuracy is not simply a function of increased capacity in the neural network.
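
To make the depth encoding concrete, the sketch below builds a three-channel HHA-style image from an up-sampled LIDAR depth map. This is a minimal sketch, not the authors' code: the actual HHA implementation of [9] estimates surface normals and the gravity direction robustly, whereas here the angle channel is approximated from image-gradient normals, the height-above-ground map is assumed to be precomputed, and the focal length is a placeholder KITTI-like value.

```python
import numpy as np

def depth_to_hha(depth, height, f=721.5, eps=1e-6):
    """Encode an up-sampled LIDAR depth map as a 3-channel HHA-style image.

    depth  : (H, W) metric depth in meters from the up-sampled point cloud.
    height : (H, W) height of each pixel above the ground plane, in meters
             (assumed precomputed here; placeholder for the real estimate).
    f      : assumed focal length in pixels (KITTI-like placeholder).
    """
    # Channel 1: horizontal disparity, i.e. focal length over depth.
    disparity = f / np.maximum(depth, eps)

    # Channel 3: angle (in degrees) between a crude surface normal,
    # taken from depth-image gradients, and the camera's vertical axis.
    dzdy, dzdx = np.gradient(depth)
    normals = np.stack([-dzdx, -dzdy, np.ones_like(depth)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    angle = np.degrees(np.arccos(np.clip(normals[..., 1], -1.0, 1.0)))

    def to_uint8(x):
        # Rescale each channel to [0, 255] so the HHA image can be
        # stacked with RGB and fed to an ImageNet-pretrained CNN.
        x = (x - x.min()) / max(x.max() - x.min(), eps)
        return (255.0 * x).astype(np.uint8)

    return np.stack(
        [to_uint8(disparity), to_uint8(height), to_uint8(angle)], axis=-1
    )
```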
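
The abstract's central question, where to fuse, can likewise be illustrated with a toy two-stream network. The PyTorch sketch below (the paper itself works within an R-CNN pipeline, and all layer sizes here are hypothetical) shows mid-level fusion by channel concatenation: each modality gets its own early layers, and the streams join before a shared trunk. Moving the `torch.cat` earlier or later corresponds to the early and late fusion points the paper compares.

```python
import torch
import torch.nn as nn

class MidFusionNet(nn.Module):
    """Toy two-stream classifier fusing RGB and HHA at a middle layer."""

    def __init__(self, num_classes=2, width=64):
        super().__init__()

        def stem():
            # Modality-specific early layers (low-level features).
            return nn.Sequential(
                nn.Conv2d(3, width, kernel_size=5, stride=2, padding=2),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            )

        self.rgb_stem = stem()
        self.hha_stem = stem()
        # Shared trunk after mid-level fusion by channel concatenation.
        self.trunk = nn.Sequential(
            nn.Conv2d(2 * width, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, num_classes),  # pedestrian vs. background
        )

    def forward(self, rgb, hha):
        fused = torch.cat([self.rgb_stem(rgb), self.hha_stem(hha)], dim=1)
        return self.trunk(fused)

# Example: a batch of 4 crops, each a paired 3-channel RGB and HHA image.
net = MidFusionNet()
scores = net(torch.randn(4, 3, 128, 64), torch.randn(4, 3, 128, 64))
print(scores.shape)  # torch.Size([4, 2])
```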
