
STICKER: An Energy-Efficient Multi-Sparsity Compatible Accelerator for Convolutional Neural Networks in 65-nm CMOS



Abstract

STICKER is an energy-efficient convolutional neural network (NN) processor that improves energy efficiency mainly by making full use of sparsity. Network sparsity can lower storage and computation requirements. However, the sparsity of both activations and weights ranges from 2% to 99% across layers and models, so good support for this wide sparsity distribution is the key to improving energy efficiency. Three new features are proposed in this article to support it efficiently. First, multi-sparsity control and data flow are implemented for finer sparsity granularity support; they automatically switch the processor among nine sparsity modes for higher energy efficiency. Second, a multi-mode hierarchical data memory, which can be reconfigured for networks with different sparsity modes, is designed for higher storage efficiency. Third, a multi-sparsity-compatible set-associative convolution processing element (PE) array is designed to carry out convolution operations efficiently under different sparsity modes, especially when both activations and weights are sparse. STICKER was implemented in a 65-nm CMOS technology. With its wide-range sparsity support, the peak energy efficiency reaches 62.1 TOPS/W when the sparsity ratios of both activations and weights are 5%. On a completely pruned AlexNet model, STICKER achieves an energy efficiency of 2.82 TOPS/W, 1.8$\times$ higher than that of state-of-the-art processors.
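The idea of switching among sparsity modes can be illustrated with a minimal Python sketch. It is not the processor's actual control logic. The abstract states only that there are nine modes, so the sketch assumes, purely for illustration, that they arise as three levels for activations times three levels for weights. The names `Level`, `classify`, `select_mode`, and `effective_macs`, and both threshold values, are hypothetical. The last function shows why sparsity saves computation: a multiply-accumulate (MAC) is needed only where both operands are nonzero.

```python
from enum import Enum


class Level(Enum):
    """Hypothetical sparsity levels; the abstract does not name them."""
    DENSE = 0
    MODERATE = 1
    SPARSE = 2


# Assumed thresholds on the fraction of nonzero entries (not from the source).
MODERATE_MAX = 0.5
SPARSE_MAX = 0.1


def nonzero_fraction(values):
    """Fraction of entries that are nonzero; an empty list counts as dense."""
    if not values:
        return 1.0
    return sum(1 for v in values if v != 0) / len(values)


def classify(values):
    """Map a list of values to one of the three assumed levels."""
    frac = nonzero_fraction(values)
    if frac <= SPARSE_MAX:
        return Level.SPARSE
    if frac <= MODERATE_MAX:
        return Level.MODERATE
    return Level.DENSE


def select_mode(activations, weights):
    """Return an (activation level, weight level) pair: 3 x 3 = 9 modes."""
    return classify(activations), classify(weights)


def effective_macs(activations, weights):
    """Count the MACs in which both the activation and the weight are nonzero."""
    return sum(1 for a, w in zip(activations, weights) if a != 0 and w != 0)


if __name__ == "__main__":
    acts = [0, 3, 0, 0, 5, 0, 0, 0, 0, 0]
    wts = [1, 2, 0, 4, 1, 1, 3, 0, 2, 1]
    print(select_mode(acts, wts))
    print(effective_macs(acts, wts), "of", len(acts), "MACs needed")
```

In this toy input only two of the ten activation-weight pairs have both operands nonzero, so a sparsity-aware datapath could skip the other eight multiplications.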

Bibliographic record

  • Source
    IEEE Journal of Solid-State Circuits | 2020, No. 2 | pp. 465-477 | 13 pages
  • Author affiliations

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

    Shanghai Jiao Tong Univ Dept Micro Nano Elect Shanghai 200240 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

    Tsinghua Univ Dept Elect Engn Beijing 100084 Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification
  • Keywords

    Accelerator; neural network (NN); sparsity;

