International Conference on Artificial Intelligence: Applications and Innovations

Detection of Big Animals on Images with Road Scenes using Deep Learning


Abstract

The recognition of big animals in images of road scenes has received little attention in modern research, and very few specialized data sets exist for this task. Popular open data sets contain many images of big animals, but most of these images do not correspond to road scenes, which is what the on-board vision systems of unmanned vehicles require. The paper describes the preparation of such a specialized data set based on the Google Open Images and COCO datasets. The resulting data set contains about 20000 images of big animals in 10 classes: "Bear", "Fox", "Dog", "Horse", "Goat", "Sheep", "Cow", "Zebra", "Elephant", "Giraffe". The paper investigates deep learning approaches to detecting these objects. The authors trained and tested four modern neural network architectures: YOLOv3, RetinaNet R-50-FPN, Faster R-CNN R-50-FPN, and Cascade R-CNN R-50-FPN. To compare the approaches, the mean average precision (mAP) was determined at IoU≥50%, and inference speed was measured for an input tensor size of 640x384x3. The YOLOv3 architecture demonstrated the highest quality metrics both for ten-class detection (0.78 mAP) and for detection of a single joint class (0.92 mAP), at a speed of more than 35 fps on an NVIDIA Tesla V100 32GB video card. On the same hardware, the RetinaNet R-50-FPN architecture provided a recognition speed of more than 44 fps with a 13% lower mAP. The software implementation used the Keras and PyTorch deep learning libraries and NVIDIA CUDA technology. The proposed data set and neural network approach to recognizing big animals in images have shown their effectiveness and can be used in the on-board vision systems of driverless cars or in driver assistance systems.
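As an illustration of the IoU≥50% criterion mentioned in the abstract, the following minimal sketch computes the intersection over union of two bounding boxes and checks whether a predicted box would count as a match. It is not the authors' evaluation code: the function names `iou` and `is_true_positive` and the `(x_min, y_min, x_max, y_max)` box format are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption
    for the example, not something stated in the abstract.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction matches a ground-truth box when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold
```

A full mAP computation additionally ranks predictions by confidence and averages precision over recall levels and classes; the check above covers only the matching step.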
