International Conference on Artificial Intelligence: Applications and Innovations

Detection of Big Animals on Images with Road Scenes using Deep Learning

Abstract

The recognition of big animals in images of road scenes has received little attention in modern research, and very few specialized data sets exist for this task. Popular open data sets contain many images of big animals, but most of them do not show road scenes, which is what the on-board vision systems of unmanned vehicles require. The paper describes the preparation of such a specialized data set based on the Google Open Images and COCO datasets. The resulting data set contains about 20000 images of big animals in 10 classes: 'Bear', 'Fox', 'Dog', 'Horse', 'Goat', 'Sheep', 'Cow', 'Zebra', 'Elephant', 'Giraffe'. The paper investigates deep learning approaches to detecting these objects. The authors trained and tested the modern neural network architectures YOLOv3, RetinaNet R-50-FPN, Faster R-CNN R-50-FPN, and Cascade R-CNN R-50-FPN. To compare the approaches, the mean average precision (mAP) was determined at IoU≥50%, and speed was measured for an input tensor size of 640x384x3. The YOLOv3 architecture demonstrated the highest quality metrics, both for ten-class detection (0.78 mAP) and for detection of one joint class (0.92 mAP), at a speed of more than 35 fps on an NVidia Tesla V-100 32GB video card. On the same hardware, the RetinaNet R-50-FPN architecture provided a recognition speed of more than 44 fps with a 13% lower mAP. The software was implemented using the Keras and PyTorch deep learning libraries and NVidia CUDA technology. The proposed data set and neural network approach to recognizing big animals in images have shown their effectiveness and can be used in the on-board vision systems of driverless cars or in driver assistance systems.
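The IoU≥50% criterion behind the reported mAP can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the `(x1, y1, x2, y2)` box format, and the `joint_class` flag are assumptions made here for illustration. The sketch only shows how a single predicted box might be judged against a single ground-truth box, either per class (the ten-class setting) or with all animals merged into one class (the joint-class setting).

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2).

    The corner-coordinate box format is an assumption of this sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label,
                     threshold=0.5, joint_class=False):
    """Return True if a prediction matches a ground-truth box.

    With joint_class=True (hypothetical flag), labels are ignored, which
    mimics treating all ten animal classes as one joint class.
    """
    same_class = joint_class or pred_label == gt_label
    return same_class and iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    pred, gt = (0, 0, 10, 10), (0, 0, 10, 8)
    print(iou(pred, gt))
    print(is_true_positive(pred, "Dog", gt, "Fox"))
    print(is_true_positive(pred, "Dog", gt, "Fox", joint_class=True))
```

A full mAP computation would additionally rank predictions by confidence, match each ground-truth box at most once, and average the per-class average precision; those steps are omitted here.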