IEEE Transactions on Computers

PermCNN: Energy-Efficient Convolutional Neural Network Hardware Architecture With Permuted Diagonal Structure



Abstract

In the emerging artificial intelligence (AI) era, efficient hardware accelerator design for deep neural networks (DNNs) is very important to enable real-time, energy-efficient DNN model deployment. To this end, various DNN model compression approaches and the corresponding hardware architectures have been intensively investigated. Recently, PermDNN, a model compression approach that imposes a permuted diagonal structure, was proposed with promising classification performance and hardware performance. However, the existing PermDNN hardware architecture is specifically designed for DNN models containing fully-connected (FC) layers, while support for convolutional (CONV) layers is missing. To fill this gap, this article proposes PermCNN, an energy-efficient hardware architecture for permuted diagonal structured convolutional neural networks (CNNs). By fully utilizing the strong structured sparsity in the trained models and dedicatedly leveraging the dynamic activation sparsity, PermCNN delivers very high hardware performance for inference tasks on CNN models. A design example in 28 nm CMOS technology shows that, compared to the state-of-the-art CNN accelerator, PermCNN achieves 3.74x and 3.11x improvements in area and energy efficiency, respectively, on the AlexNet workload, and 17.49x and 14.22x improvements in area and energy efficiency, respectively, on the VGG model. After including the energy consumption incurred by DRAM accesses, PermCNN achieves 2.60x and 9.62x overall energy consumption improvements on the AlexNet and VGG workloads, respectively.
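In PermDNN-style compression, each p x p block of a weight matrix is constrained to a permuted diagonal: row i keeps a single non-zero weight, at column (i + k) mod p for a per-block offset k. The following is a minimal NumPy sketch (not from the paper; the block size p, offset k, and helper names are illustrative) of how such a block can be stored with only p weights plus one offset, and how its matrix-vector product skips all zero entries, which is the kind of structured sparsity the PermCNN hardware exploits.

```python
import numpy as np

def permuted_diagonal_block(p, k, values):
    """Dense view of a p x p permuted-diagonal block: row i has its only
    non-zero weight at column (i + k) % p, so the block is fully described
    by the offset k and the p stored values."""
    block = np.zeros((p, p))
    block[np.arange(p), (np.arange(p) + k) % p] = values
    return block

def permdiag_matvec(values, k, x):
    """Compressed matrix-vector product that never touches the zeros:
    y[i] = values[i] * x[(i + k) % p]."""
    p = len(values)
    return values * x[(np.arange(p) + k) % p]

# Sanity check: the compressed product matches the dense one.
p, k = 4, 1
values = np.array([0.5, -1.2, 0.3, 2.0])
x = np.random.randn(p)
assert np.allclose(permuted_diagonal_block(p, k, values) @ x,
                   permdiag_matvec(values, k, x))
```

Storing only p weights and one offset per block yields a p-fold compression with a regular, index-computable access pattern, which is what makes the sparsity hardware-friendly. The actual PermCNN processing elements additionally gate computation on zero activations (dynamic activation sparsity), which this sketch does not model.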
