Diffy: a Déjà vu-Free Differential Deep Neural Network Accelerator

Abstract

We show that Deep Convolutional Neural Network (CNN) implementations of computational imaging tasks exhibit spatially correlated values. We exploit this correlation to reduce the amount of computation, communication, and storage needed to execute such CNNs by introducing Diffy, a hardware accelerator that performs Differential Convolution. Diffy stores, communicates, and processes the bulk of the activation values as deltas. Experiments show that, over five state-of-the-art CNN models and for HD resolution inputs, Diffy boosts the average performance by 7.1× over a baseline value-agnostic accelerator [1] and by 1.41× over a state-of-the-art accelerator that processes only the effectual content of the raw activation values [2]. Further, Diffy is respectively 1.83× and 1.36× more energy efficient when considering only the on-chip energy. However, Diffy requires 55% less on-chip storage and 2.5× less off-chip bandwidth compared to storing the raw values using profiled per-layer precisions [3]. Compared to using dynamic per group precisions [4], Diffy requires 32% less storage and 1.43× less off-chip memory bandwidth. More importantly, Diffy provides the performance necessary to achieve real-time processing of HD resolution images with practical configurations. Finally, Diffy is robust and can serve as a general CNN accelerator as it improves performance even for image classification models.
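The idea of processing activations as deltas can be illustrated with a small sketch. This is a simplified one-dimensional illustration, not the Diffy hardware dataflow: it relies only on the linearity of convolution, under which the output for one window equals the output for the previous window plus the same weights applied to the element-wise differences between the two windows. The function names `conv1d_direct` and `conv1d_differential`, and the sample data, are hypothetical and chosen for this example.

```python
def conv1d_direct(activations, weights):
    """Plain sliding-window dot product over raw values (valid positions only)."""
    k = len(weights)
    return [
        sum(w * a for w, a in zip(weights, activations[i:i + k]))
        for i in range(len(activations) - k + 1)
    ]


def conv1d_differential(activations, weights):
    """Same outputs, but every window after the first is computed from deltas."""
    k = len(weights)
    n_out = len(activations) - k + 1
    if n_out <= 0:
        return []
    # The first output has no predecessor, so it uses the raw values.
    outputs = [sum(w * a for w, a in zip(weights, activations[:k]))]
    # deltas[j] = activations[j + 1] - activations[j]
    deltas = [b - a for a, b in zip(activations, activations[1:])]
    for i in range(1, n_out):
        # Window i minus window i-1, element by element, is deltas[i-1 : i-1+k].
        correction = sum(w * d for w, d in zip(weights, deltas[i - 1:i - 1 + k]))
        outputs.append(outputs[-1] + correction)
    return outputs


# Spatially correlated input: neighbouring values are close to each other,
# so the deltas are much smaller in magnitude than the raw values.
activations = [100, 101, 103, 102, 104, 107]
weights = [1, -2, 3]
assert conv1d_direct(activations, weights) == conv1d_differential(activations, weights)
```

When neighbouring activations are correlated, the deltas need fewer bits than the raw values, which is the property the abstract credits for the reductions in computation, storage, and bandwidth.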

Bibliographic details

Source: International Symposium on Microarchitecture | 2018 | xxiv, 493 p. | 14 pages
Authors: Mostafa Mahmoud; Kevin Siu; Andreas Moshovos
Original format: PDF
Chinese Library Classification: TP302-532
Keywords: Correlation; Imaging; Microsoft Windows; Convolution; Computational modeling; Entropy; Task analysis

Similar literature

1. An Overview of Efficient Interconnection Networks for Deep Neural Network Accelerators [J]. Nabavinejad Seyed Morteza, Baharloo Mohammad, Chen Kun-Chih. IEEE Journal on Emerging and Selected Topics in Circuits and Systems. 2020, no. 3
2. Low power & mobile hardware accelerators for deep convolutional neural networks [J]. Scanlan Anthony G. Integration. 2019, issue MARa
3. Diffy: a Déjà vu-Free Differential Deep Neural Network Accelerator [C]. Mostafa Mahmoud, Kevin Siu, Andreas Moshovos. Annual IEEE/ACM International Symposium on Microarchitecture. 2018
4. Reducing Off-chip Memory Accesses in Deep Neural Network Accelerators [D]. Siu, Kevin. 2019
5. Differential Evolution Based Layer-Wise Weight Pruning for Compressing Deep Neural Networks [O]. Tao Wu, Xiaoyang Li, Deyun Zhou. 2021
6. Automated optimization for memory-efficient high-performance deep neural network accelerators [O]. HyunMi Kim, Chun-Gi Lyuh, Youngsu Kwon. 2020
