
CCR: A concise convolution rule for sparse neural network accelerators


Abstract

Convolutional neural networks (CNNs) have achieved great success in a broad range of applications. As CNN-based methods are often both computation and memory intensive, sparse CNNs have emerged as an effective solution that reduces the amount of computation and memory accesses while maintaining high accuracy. However, dense CNN accelerators can hardly benefit from this reduction because they lack support for irregular, sparse models. This paper proposes a concise convolution rule (CCR) to bridge the gap between sparse CNNs and dense CNN accelerators. CCR transforms a sparse convolution into multiple effective and ineffective sub-convolutions. The ineffective convolutions, in which either the neurons or the synapses are all zeros, do not contribute to the final results, so their computations and memory accesses can be eliminated. The effective convolutions, in which both the neurons and the synapses are dense, can be easily mapped onto existing dense CNN accelerators. Unlike prior approaches, which trade complexity for flexibility, CCR reaps the benefits of reduced computation and memory accesses while retaining the speed of existing dense architectures, without intrusive PE modifications. As a case study, we implemented a sparse CNN accelerator, SparseK, following the rationale of CCR. Experiments show that SparseK achieves a speedup of 2.9× on VGG16 over a comparably provisioned dense architecture. Compared with state-of-the-art sparse accelerators, SparseK improves performance and energy efficiency by 1.8× and 1.5×, respectively.
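The decomposition the abstract describes can be sketched in a few lines. The Python fragment below is a minimal illustration under assumed simplifications, not the paper's actual rule: the per-input-channel split, the (C, H, W) layout, and the name ccr_conv are all illustrative choices rather than details taken from the paper.

    import numpy as np
    from scipy.signal import correlate2d

    def ccr_conv(neurons, synapses):
        # Sketch of the CCR idea (illustrative, not the paper's exact rule):
        # split a sparse multi-channel convolution into per-channel
        # sub-convolutions, eliminate the ineffective ones (all-zero neuron
        # slice or all-zero synapse slice), and run the effective ones as
        # ordinary dense 2-D convolutions, which is the workload a dense
        # CNN accelerator already handles.
        #   neurons:  (C, H, W) input feature maps, possibly sparse
        #   synapses: (C, K, K) one pruned filter, possibly sparse
        C, H, W = neurons.shape
        _, K, _ = synapses.shape
        out = np.zeros((H - K + 1, W - K + 1))
        for c in range(C):
            if not neurons[c].any() or not synapses[c].any():
                continue  # ineffective sub-convolution: skip compute and loads
            out += correlate2d(neurons[c], synapses[c], mode="valid")
        return out

In this toy version the savings come entirely from the skipped channels; the paper's contribution is choosing the decomposition so that the surviving sub-convolutions are dense and map onto existing dense PEs without intrusive modifications.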

Bibliographic details

  • Source: 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)
  • Venue: Dresden (DE)
  • Author affiliation

    State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences

  • Format: PDF
  • Language: English
  • Keywords

    Convolution; Kernel; Neurons; Synapses; Two dimensional displays; Sparse matrices; Computer architecture

