Design of FPGA-Based Accelerator for Convolutional Neural Network under Heterogeneous Computing Framework with OpenCL

Li Luo; Yakun Wu; Fei Qiao; Yi Yang; Qi Wei; Xiaobo Zhou; Yongkai Fan; Shuzheng Xu; Xinjun Liu; Huazhong Yang

Abstract

CPUs lack the resources to compute convolutional neural networks (CNNs) efficiently, especially in embedded applications. Heterogeneous computing platforms such as GPUs, FPGAs, and ASICs are therefore widely used to accelerate CNN tasks. Among these, an FPGA can accelerate the computation by mapping the algorithm onto parallel hardware, whereas a CPU cannot fully exploit the parallelism. By fully using the parallelism of the neural network structure, an FPGA can reduce computing costs and increase computing speed. However, FPGA development requires considerable design skill. As a heterogeneous development platform, OpenCL offers a high abstraction level, a short development cycle, and strong portability, which can compensate for a shortage of skilled designers. This paper uses Xilinx SDAccel to realize the parallel acceleration of a CNN task and proposes an optimization strategy for a single convolutional layer. Simulation results show that the proposed strategy improves calculation speed. Compared with the baseline design, the single-convolutional-layer strategy increases computing speed by a factor of 14. The performance of the whole CNN task improves by a factor of 2 compared with before optimization, and image classification reaches more than 48 fps.
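
The parallelism the abstract refers to can be seen in the loop nest of a direct convolutional layer. The following is a minimal sketch in plain Python, not the paper's OpenCL kernel or its optimization strategy: the function names `conv_output_map` and `conv_layer`, the stride of 1, and the absence of padding are all assumptions made for illustration. Each output feature map depends only on the shared inputs and its own kernels, so the outer loop over output maps (and likewise the loops over output pixels) has independent iterations that parallel hardware could, in principle, execute concurrently.

```python
# Illustrative sketch only: a direct (naive) convolutional layer.
# All names and the stride-1, no-padding setup are assumptions,
# not the kernel or optimization described in the paper.

def conv_output_map(inputs, kernels_for_map, bias=0.0):
    """Compute one output feature map.

    inputs:          list of C input maps, each H x W (list of lists)
    kernels_for_map: list of C kernels, each K x K, one per input map
    """
    height, width = len(inputs[0]), len(inputs[0][0])
    k = len(kernels_for_map[0])
    out_h, out_w = height - k + 1, width - k + 1
    out = [[bias] * out_w for _ in range(out_h)]
    for c, kernel in enumerate(kernels_for_map):  # input channels
        for y in range(out_h):                    # output rows
            for x in range(out_w):                # output columns
                acc = 0.0
                for i in range(k):                # kernel rows
                    for j in range(k):            # kernel columns
                        acc += inputs[c][y + i][x + j] * kernel[i][j]
                out[y][x] += acc
    return out


def conv_layer(inputs, kernels, biases):
    """Compute all output maps; iterations over m are independent."""
    return [conv_output_map(inputs, kernels[m], biases[m])
            for m in range(len(kernels))]


# One 3x3 input map, two output maps with 2x2 kernels.
image = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
kernels = [[[[1, 0], [0, 1]]], [[[0, 1], [1, 0]]]]
feature_maps = conv_layer(image, kernels, biases=[0.0, 1.0])
```

For scale, the reported rate of more than 48 fps corresponds to a budget of less than 1/48 s, roughly 20.8 ms, per classified image.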

Bibliographic record

Source
International Journal of Reconfigurable Computing | 2018, issue 2018 | 1785892.1-1785892.10 | 10 pages
Authors
Li Luo; Yakun Wu; Fei Qiao; Yi Yang; Qi Wei; Xiaobo Zhou; Yongkai Fan; Shuzheng Xu; Xinjun Liu; Huazhong Yang
Author affiliations

Department of Electronic Science and Technology, Beijing Jiaotong University, Beijing, China

Department of Electronic Science and Technology, Beijing Jiaotong University, Beijing, China

Department of Electronic Engineering, Tsinghua University, Beijing, China

Department of Electronic Engineering, Tsinghua University, Beijing, China

Department of Electronic Engineering, Tsinghua University, Beijing, China

Department of Electronic Science and Technology, Beijing Jiaotong University, Beijing, China

China University of Petroleum, Beijing, China

Department of Electronic Engineering, Tsinghua University, Beijing, China

Department of Mechanical Engineering, Tsinghua University, Beijing, China

Department of Electronic Engineering, Tsinghua University, Beijing, China

Indexing information
Original format: PDF
Text language: English
Date added to database: 2022-08-18 03:55:25
