
CCR: A concise convolution rule for sparse neural network accelerators


Abstract

Convolutional neural networks (CNNs) have achieved great success in a broad range of applications. As CNN-based methods are often both computation and memory intensive, sparse CNNs have emerged as an effective solution that reduces the amount of computation and memory accesses while maintaining high accuracy. However, dense CNN accelerators can hardly benefit from this reduction because they lack support for irregular, sparse models. This paper proposes a concise convolution rule (CCR) to bridge the gap between sparse CNNs and dense CNN accelerators. CCR transforms a sparse convolution into multiple effective and ineffective sub-convolutions. The ineffective convolutions, in which either the neurons or the synapses are all zeros, do not contribute to the final results, so their computations and memory accesses can be eliminated. The effective convolutions, in which both the neurons and the synapses are dense, can be easily mapped onto existing dense CNN accelerators. Unlike prior approaches, which trade complexity for flexibility, CCR reaps the benefits of reduced computation and memory accesses while retaining the speed of existing dense architectures, without intrusive PE modifications. As a case study, we implemented a sparse CNN accelerator, SparseK, following the rationale of CCR. Experiments show that SparseK achieves a speedup of 2.9× on VGG16 over a comparably provisioned dense architecture. Compared with state-of-the-art sparse accelerators, SparseK improves performance and energy efficiency by 1.8× and 1.5×, respectively.
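The decomposition the abstract describes can be sketched in a few lines. The Python fragment below is a minimal illustration under assumed simplifications, not the paper's actual rule: the per-input-channel split, the (C, H, W) layout, and the name ccr_conv are all illustrative choices rather than details taken from the paper.

    import numpy as np
    from scipy.signal import correlate2d

    def ccr_conv(neurons, synapses):
        # Sketch of the CCR idea (illustrative, not the paper's exact rule):
        # split a sparse multi-channel convolution into per-channel
        # sub-convolutions, eliminate the ineffective ones (all-zero neuron
        # slice or all-zero synapse slice), and run the effective ones as
        # ordinary dense 2-D convolutions, which is the workload a dense
        # CNN accelerator already handles.
        #   neurons:  (C, H, W) input feature maps, possibly sparse
        #   synapses: (C, K, K) one pruned filter, possibly sparse
        C, H, W = neurons.shape
        _, K, _ = synapses.shape
        out = np.zeros((H - K + 1, W - K + 1))
        for c in range(C):
            if not neurons[c].any() or not synapses[c].any():
                continue  # ineffective sub-convolution: skip compute and loads
            out += correlate2d(neurons[c], synapses[c], mode="valid")
        return out

In this toy version the savings come entirely from the skipped channels; the paper's contribution is choosing the decomposition so that the surviving sub-convolutions are dense and map onto existing dense PEs without intrusive modifications.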

Bibliographic details

  • Source: 2018 Design, Automation & Test in Europe Conference & Exhibition (DATE)
  • Venue: Dresden (DE)
  • Author affiliation

    State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences; University of Chinese Academy of Sciences

  • Format: PDF
  • Language: English
  • Keywords

    Convolution; Kernel; Neurons; Synapses; Two dimensional displays; Sparse matrices; Computer architecture

