ACM/IEEE Design Automation Conference

SparseTrain: Exploiting Dataflow Sparsity for Efficient Convolutional Neural Networks Training



Abstract

Training Convolutional Neural Networks (CNNs) usually requires a large amount of computational resources. In this paper, SparseTrain is proposed to accelerate CNN training by fully exploiting sparsity. It involves three levels of innovation: an activation-gradient pruning algorithm, a sparse training dataflow, and an accelerator architecture. By applying a stochastic pruning algorithm to each layer, the sparsity of the back-propagated gradients can be increased dramatically without degrading training accuracy or convergence rate. Moreover, to exploit both natural sparsity (resulting from ReLU or pooling layers) and artificial sparsity (introduced by the pruning algorithm), a sparsity-aware architecture is proposed for training acceleration. This architecture supports both forward and backward propagation of CNNs by adopting a 1-dimensional convolution dataflow. We have built a cycle-accurate architecture simulator to evaluate performance and efficiency, based on a design synthesized with a 14nm FinFET technology. Evaluation results on AlexNet/ResNet show that SparseTrain achieves about 2.7× speedup and 2.2× energy-efficiency improvement on average compared with the original training process.
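The abstract does not spell out the pruning rule, but a common way to sparsify small gradients stochastically while keeping each entry unbiased in expectation is to zero values below a threshold with probability proportional to how far they fall below it, and otherwise snap them to the threshold. The sketch below is a minimal illustration of that idea; the threshold `tau` and the snap-to-threshold rule are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

def stochastic_prune(grad: np.ndarray, tau: float) -> np.ndarray:
    """Stochastically prune small activation gradients (illustrative sketch).

    Entries with |g| >= tau are kept as-is. An entry with |g| < tau is set
    to 0 with probability 1 - |g|/tau, or snapped to sign(g) * tau with
    probability |g|/tau, so its expected value equals the original gradient.
    How the per-layer threshold tau is chosen is not reproduced here.
    """
    mag = np.abs(grad)
    small = mag < tau
    keep_prob = mag / tau                              # in [0, 1) for small entries
    snap = np.random.random(grad.shape) < keep_prob    # keep (snapped) with prob |g|/tau
    pruned = np.where(snap, np.sign(grad) * tau, 0.0)
    return np.where(small, pruned, grad)

# Example: most small gradients become exact zeros, raising the sparsity
# available to the backward pass while preserving each entry in expectation.
g = np.random.randn(4, 8).astype(np.float32) * 0.01
print(stochastic_prune(g, tau=0.02))
```

Because the pruned tensor is unbiased in expectation, training accuracy and convergence can be preserved while the extra zeros feed the sparsity-aware dataflow described above.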
