International Symposium on Advanced Parallel Processing Technologies

Using Data Compression for Optimizing FPGA-Based Convolutional Neural Network Accelerators



Abstract

Convolutional Neural Networks (CNNs) have been extensively employed in research fields including multimedia recognition, computer vision, etc. Various FPGA-based accelerators for deep CNNs have been proposed to achieve high energy efficiency. For some FPGA-based CNN accelerators in embedded systems, such as UAVs, IoT devices, and wearables, overall performance is tightly bounded by the limited data bandwidth to the on-board DRAM. In this paper, we argue that it is feasible to overcome the bandwidth bottleneck using data compression techniques. We propose an effective roofline model to explore the design tradeoff between computation logic and data bandwidth after applying data compression to the parameters of CNNs. As case studies, we implement a decompression module and a CNN accelerator on a single Xilinx VC707 FPGA board with two different compression/decompression algorithms. Under a scenario with limited data bandwidth, our implementation outperforms designs using previous methods by 3.2× in overall performance.
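The tradeoff the abstract describes can be sketched with the classic roofline formula: attainable throughput is the minimum of the accelerator's peak compute rate and the product of operational intensity (operations per byte fetched from DRAM) and memory bandwidth. Compressing parameters reduces the bytes fetched per operation, which raises the effective operational intensity. The numbers below are purely illustrative assumptions, not figures from the paper:

```python
def roofline(peak_gflops, bandwidth_gbs, ops_per_byte):
    """Attainable performance (GFLOP/s) under the roofline model:
    the lesser of the compute roof and the bandwidth-limited slope."""
    return min(peak_gflops, ops_per_byte * bandwidth_gbs)

# Hypothetical accelerator parameters for illustration only.
peak = 500.0       # GFLOP/s of on-chip compute logic
bw = 4.0           # GB/s of effective DRAM bandwidth (bandwidth-limited scenario)
intensity = 20.0   # ops per byte of uncompressed parameter traffic

baseline = roofline(peak, bw, intensity)            # bandwidth-bound: 80.0
# A 3.2x parameter compression ratio means 3.2x fewer bytes fetched
# per operation, so operational intensity scales by the same factor.
compressed = roofline(peak, bw, intensity * 3.2)    # 256.0, still under the roof
```

In the bandwidth-bound region the speedup tracks the compression ratio directly (256 / 80 = 3.2×ۘ here), which is the effect the paper exploits; once the design crosses the ridge point, further compression yields no gain and the roofline model instead argues for spending resources on compute logic.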
