International Journal of Computational Science and Engineering

Laius: an energy-efficient FPGA CNN accelerator with the support of a fixed-point training framework


Abstract

With the development of convolutional neural networks (CNNs), their high computational complexity and energy consumption have become significant problems. Many CNN inference accelerators have been proposed to reduce this consumption. Most of them are based on 32-bit floating-point matrix multiplication, where the data precision is over-provisioned. This paper presents Laius, an 8-bit fixed-point LeNet inference engine implemented on FPGA. To achieve low-precision computation and storage, we introduce our fixed-point training framework, FixCaffe. To economise FPGA resources, we propose a methodology to find the optimal bit-length for the weights and biases in LeNet. We use pipelining, tiling, and theoretical analysis to improve performance. Experimental results show that Laius achieves 44.9 Gops throughput. Moreover, with only 1% accuracy loss, 8-bit Laius reduces delay by 31.43%, LUT consumption by 87.01%, BRAM consumption by 66.50%, DSP consumption by 65.11%, and power by 47.95% compared to the 32-bit version with the same structure.
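The core idea behind the 8-bit fixed-point representation described in the abstract can be sketched as follows. This is a minimal illustrative example, not the paper's FixCaffe implementation: the function name `quantize_fixed_point` and the split of 8 bits into 4 integer and 4 fractional bits are assumptions chosen for demonstration.

```python
import numpy as np

def quantize_fixed_point(x, total_bits=8, frac_bits=4):
    """Snap values to a signed fixed-point grid with `frac_bits` fractional bits.

    Values are scaled, rounded to integers, clipped to the signed range of
    `total_bits`, and rescaled, so the returned floats show the exact
    representable values and make the quantization error easy to inspect.
    """
    scale = 2 ** frac_bits
    qmin = -(2 ** (total_bits - 1))       # -128 for 8 bits
    qmax = 2 ** (total_bits - 1) - 1      # +127 for 8 bits
    q = np.clip(np.round(x * scale), qmin, qmax)
    return q / scale

# Hypothetical weight values: small magnitudes quantize with little error,
# while out-of-range values saturate at the clipping boundary.
weights = np.array([0.1234, -0.5678, 3.9, -10.0])
quantized = quantize_fixed_point(weights)
```

Searching over `total_bits`/`frac_bits` while monitoring the resulting accuracy loss is one plausible way to frame the bit-length selection the abstract mentions for weights and biases.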
