IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

Design and Analysis of a Neural Network Inference Engine Based on Adaptive Weight Compression



Abstract

Neural networks generally require significant memory capacity and bandwidth to store and access their large number of synaptic weights. This paper presents the design of an energy-efficient neural network inference engine based on adaptive weight compression using a JPEG image encoding algorithm. To maximize the compression ratio with minimum accuracy loss, the quality factor of the JPEG encoder is adaptively controlled according to the accuracy impact of each block. With 1% accuracy loss, the proposed approach achieves $63.4\times$ compression for a multilayer perceptron (MLP) and $31.3\times$ for LeNet-5 on the MNIST dataset, and $15.3\times$ for AlexNet and $10.2\times$ for ResNet-50 on ImageNet. The reduced memory requirement leads to higher throughput and lower energy for neural network inference ($3\times$ effective memory bandwidth and $22\times$ lower system energy for the MLP).
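The core idea in the abstract — per-block transform coding of weights, with the quality factor chosen per block according to its accuracy impact — can be illustrated with a minimal sketch. This is not the paper's implementation: the DCT-based codec, the quality-to-step-size mapping, the probe-input error proxy, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix (the transform JPEG applies per block).
    k = np.arange(n)
    M = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    M[0] /= np.sqrt(2)
    return M * np.sqrt(2.0 / n)

def jpeg_like_codec(block, quality):
    # Transform an 8x8 weight block, quantize the DCT coefficients with a
    # step size that shrinks as `quality` grows, then reconstruct.
    # (Hypothetical quality-to-step mapping, not the JPEG standard tables.)
    D = dct_matrix(block.shape[0])
    coeffs = D @ block @ D.T
    step = 1.0 / quality                 # higher quality -> finer quantization
    quantized = np.round(coeffs / step) * step
    return D.T @ quantized @ D

def adaptive_quality(block, probe, max_err, qualities=(2, 4, 8, 16, 32, 64)):
    # Pick the lowest quality factor (highest compression) whose reconstructed
    # block keeps the output error on some probe activations below `max_err` --
    # a stand-in for the per-block accuracy-impact test described in the abstract.
    ref = probe @ block
    for qf in qualities:
        rec = jpeg_like_codec(block, qf)
        if np.max(np.abs(probe @ rec - ref)) <= max_err:
            return qf, rec
    return qualities[-1], jpeg_like_codec(block, qualities[-1])

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(8, 8))   # one 8x8 block of synaptic weights
X = rng.normal(size=(4, 8))              # probe activations for the error check
qf, W_rec = adaptive_quality(W, X, max_err=0.05)
```

Blocks whose coefficients tolerate coarse quantization get a low quality factor (high compression); accuracy-sensitive blocks fall through to finer quantization, which is how an adaptive scheme can outperform a single global quality setting.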
