IEEE International Conference on Acoustics, Speech and Signal Processing

EFFICIENT DEEP CONVOLUTIONAL NEURAL NETWORKS ACCELERATOR WITHOUT MULTIPLICATION AND RETRAINING


Abstract

Recently, low-precision weight methods have been considered a promising way to implement inference of deep convolutional neural networks (DCNNs) efficiently, but they suffer from expensive retraining costs and accuracy degradation. In this paper, a low-bit, retraining-free quantization method is proposed that enables DCNNs to perform inference using only shift and add operations. Its efficiency is demonstrated in terms of power consumption and chip area. Huffman coding is adopted for further compression. An efficient hardware accelerator tailored to the given quantization strategy is then introduced by exploiting a two-level systolic array. Experimental results show that our method achieves higher accuracy on ImageNet than other low-precision networks, without any retraining. Compared to full-precision counterparts, 5× to 8× compression is obtained on popular models. Furthermore, the hardware implementation shows a good reduction in slices while maintaining throughput.
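The abstract describes weights quantized so that every multiply reduces to a shift (and accumulation to adds). Below is a minimal Python sketch of this general idea; the exponent range, bit budget, and rounding rule are illustrative assumptions, not the paper's exact scheme.

    import numpy as np

    def quantize_pow2(w, n_bits=4):
        # Round each nonzero weight to the nearest signed power of two,
        # so multiplying by the weight reduces to a bit shift in hardware.
        # The |w| <= 1 range and bit budget are assumptions for this sketch.
        sign = np.sign(w)
        mag = np.abs(w)
        e = np.round(np.log2(np.maximum(mag, 1e-12)))   # nearest exponent
        e_max = 0                                       # assume |w| <= 1
        e_min = e_max - (2 ** (n_bits - 1) - 1)         # smallest representable exponent
        e = np.clip(e, e_min, e_max)
        q = sign * np.exp2(e)
        return np.where(mag > 0, q, 0.0)                # zeros stay zero

    w = np.array([0.30, -0.07, 0.001, 0.9])
    print(quantize_pow2(w))  # -> [ 0.25  -0.0625  0.0078125  1. ]

With weights of this form, each multiply-accumulate y += w * x in a convolution becomes y += (x << e) or y -= (x >> -e) in fixed-point logic, which is what allows an accelerator to operate without hardware multipliers.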
