IEEE International Conference on Artificial Intelligence Circuits and Systems

HPPU: An Energy-Efficient Sparse DNN Training Processor with Hybrid Weight Pruning



Abstract

Motivated by the fact that deep neural networks (DNNs) are typically highly over-parameterized, weight-pruning-based sparse training (ST) has become a practical method to reduce training computation and compress models. However, previous pruning algorithms use either a coarse-grained or a fine-grained pattern, leading to a limited pruning ratio or a drastically irregular sparsity distribution, which makes hardware implementation computation-intensive or logic-complex. Meanwhile, current DNN processors focus on sparse inference and cannot support emerging ST techniques. This paper proposes a co-design approach in which the algorithm is adapted to suit the hardware constraints and the hardware exploits the algorithm's properties to accelerate sparse training. We first present a novel pruning algorithm, hybrid weight pruning, which combines channel-wise and line-wise pruning. It reaches a considerable pruning ratio while remaining hardware-friendly. We then design a hardware architecture, the Hybrid Pruning Processing Unit (HPPU), to accelerate the proposed algorithm. It employs a 2-level active data selector and a sparse convolution engine, which maximize hardware utilization when handling the hybrid sparsity patterns during training. We evaluate HPPU by synthesizing it with 28nm CMOS technology. HPPU achieves a 50.1% higher pruning ratio than coarse-grained pruning and 1.53× higher energy efficiency than fine-grained pruning. The peak energy efficiency of HPPU is 126.04 TFLOPS/W, outperforming the state-of-the-art trainable processor GANPU by 1.67×. When training a ResNet18 model, HPPU consumes 3.72× less energy, offers a 4.69× speedup, and maintains unpruned accuracy.
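The hybrid scheme pairs a coarse channel-wise step with a finer line-wise step inside the surviving channels. The abstract does not specify the pruning criteria, so the sketch below is an illustration only: the L2/L1 magnitude rankings, the `channel_ratio` and `line_ratio` parameters, and the choice of kernel rows as "lines" are all assumptions, not the authors' method.

```python
import numpy as np

def hybrid_prune(weights, channel_ratio=0.3, line_ratio=0.5):
    """Illustrative hybrid pruning of a conv weight tensor shaped
    (out_channels, in_channels, kH, kW).

    1) Channel-wise (coarse): zero whole output channels with the
       smallest L2 norms -- a regular, hardware-friendly pattern.
    2) Line-wise (finer): within surviving channels, zero entire
       kernel rows with the smallest L1 norms -- still structured
       enough for simple sparse indexing.
    """
    w = weights.copy()
    oc = w.shape[0]

    # --- channel-wise pruning: rank output channels by L2 norm ---
    ch_norms = np.linalg.norm(w.reshape(oc, -1), axis=1)
    n_ch = int(oc * channel_ratio)
    pruned_ch = np.argsort(ch_norms)[:n_ch]
    w[pruned_ch] = 0.0

    # --- line-wise pruning on the surviving channels ---
    kept = np.setdiff1d(np.arange(oc), pruned_ch)
    for c in kept:
        # L1 norm of each kernel row, summed across input channels
        line_norms = np.abs(w[c]).sum(axis=(0, 2))  # shape (kH,)
        n_ln = int(len(line_norms) * line_ratio)
        if n_ln > 0:
            w[c][:, np.argsort(line_norms)[:n_ln], :] = 0.0
    return w

# Example: prune a random 8x4x3x3 conv layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 4, 3, 3))
pruned = hybrid_prune(w)
sparsity = 1.0 - np.count_nonzero(pruned) / pruned.size
```

With these (assumed) ratios, 2 of 8 channels and 1 of 3 kernel rows in each remaining channel are zeroed, giving 50% overall sparsity while keeping the zeros in large regular blocks, which is the property that lets a selector like HPPU's skip work without fine-grained index bookkeeping.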
