Published in: International Conference on Electronics, Information, and Communication

Heterogeneous system implementation of deep learning neural network for object detection in OpenCL framework

Abstract

A major challenge today is how to implement an up-to-date object detection algorithm on a heterogeneous system. Deep learning neural network (DNN) algorithms have achieved very satisfactory performance, as in the 2012 Visual Object Classes Challenge (VOC) [1], but they depend on the CUDA [2] GPU framework and can only run on NVIDIA accelerators. We prefer a more generic acceleration framework, and OpenCL [3] meets this requirement. Whereas CUDA targets NVIDIA GPUs only, OpenCL can be applied to heterogeneous systems that include CPUs, GPUs, DSPs, FPGAs, etc. Heterogeneous systems are more flexible: some are designed for portable devices, and some for low-power parallel computation. These special-purpose devices play a very important role in modern life. In this paper, we present an OpenCL-based heterogeneous system implementation and apply a DNN framework to two typical heterogeneous systems: a portable system and an FPGA system. Our work makes the following contributions: (1) We implement a generic OpenCL-based DNN object recognition framework that can be executed on general GPUs (AMD, NVIDIA, etc.). (2) We implement our framework on the embedded system Odroid XU4 [4] using multiple GPUs and improve processing time by 25.8%. (3) We implement our framework on an FPGA system and reduce power consumption by 84.3% compared with a Titan X GPU.
