International Journal of Computational Science and Engineering

Laius: an energy-efficient FPGA CNN accelerator with the support of a fixed-point training framework


Abstract

With the development of convolutional neural networks (CNNs), their high computational complexity and energy consumption have become significant problems. Many CNN inference accelerators have been proposed to reduce this consumption. Most of them are based on 32-bit floating-point matrix multiplication, where the data precision is over-provisioned. This paper presents Laius, an 8-bit fixed-point LeNet inference engine implemented on FPGA. To achieve low-precision computation and storage, we introduce our fixed-point training framework, FixCaffe. To economise FPGA resources, we propose a methodology to find the optimal bit-length for the weights and biases in LeNet. We use pipelining, tiling, and theoretical analysis to improve performance. Experimental results show that Laius achieves 44.9 Gops throughput. Moreover, with only 1% accuracy loss, 8-bit Laius reduces delay by 31.43%, LUT consumption by 87.01%, BRAM consumption by 66.50%, DSP consumption by 65.11%, and power by 47.95% compared to the 32-bit version with the same structure.
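The core idea behind the 8-bit fixed-point representation described in the abstract can be sketched as follows. This is a minimal illustrative example, not the paper's FixCaffe implementation: the function name `quantize_fixed_point` and the split of 8 bits into 4 integer and 4 fractional bits are assumptions chosen for demonstration.

```python
import numpy as np

def quantize_fixed_point(x, total_bits=8, frac_bits=4):
    """Snap values to a signed fixed-point grid with `frac_bits` fractional bits.

    Values are scaled, rounded to integers, clipped to the signed range of
    `total_bits`, and rescaled, so the returned floats show the exact
    representable values and make the quantization error easy to inspect.
    """
    scale = 2 ** frac_bits
    qmin = -(2 ** (total_bits - 1))       # -128 for 8 bits
    qmax = 2 ** (total_bits - 1) - 1      # +127 for 8 bits
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale

# Hypothetical weight values: small magnitudes quantize with little error,
# while out-of-range values saturate at the clipping boundary.
weights = np.array([0.1234, -0.5678, 3.9, -10.0])
quantized = quantize_fixed_point(weights)
```

Searching over `total_bits`/`frac_bits` while monitoring the resulting accuracy loss is one plausible way to frame the bit-length selection the abstract mentions for weights and biases.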
