Published in: International Symposium on Microarchitecture

Diffy: a Déjà vu-Free Differential Deep Neural Network Accelerator



Abstract

We show that Deep Convolutional Neural Network (CNN) implementations of computational imaging tasks exhibit spatially correlated values. We exploit this correlation to reduce the amount of computation, communication, and storage needed to execute such CNNs by introducing Diffy, a hardware accelerator that performs Differential Convolution. Diffy stores, communicates, and processes the bulk of the activation values as deltas. Experiments show that, over five state-of-the-art CNN models and for HD resolution inputs, Diffy boosts the average performance by 7.1× over a baseline value-agnostic accelerator [1] and by 1.41× over a state-of-the-art accelerator that processes only the effectual content of the raw activation values [2]. Further, Diffy is respectively 1.83× and 1.36× more energy efficient when considering only the on-chip energy. In addition, Diffy requires 55% less on-chip storage and 2.5× less off-chip bandwidth compared to storing the raw values using profiled per-layer precisions [3]. Compared to using dynamic per-group precisions [4], Diffy requires 32% less storage and 1.43× less off-chip memory bandwidth. More importantly, Diffy provides the performance necessary to achieve real-time processing of HD resolution images with practical configurations. Finally, Diffy is robust and can serve as a general CNN accelerator, as it improves performance even for image classification models.
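The idea of processing activations as deltas can be illustrated in software. The following is a minimal 1-D sketch, not the hardware design described in the paper: it assumes stride 1, no padding, and integer activations, and the function names `direct_conv1d` and `differential_conv1d` are hypothetical. It relies only on the linearity of convolution: the output for one window equals the output for the previous window plus the kernel applied to the element-wise differences between the two windows.

```python
def direct_conv1d(activations, weights):
    """Plain sliding-window dot product (stride 1, no padding)."""
    k = len(weights)
    return [
        sum(w * a for w, a in zip(weights, activations[i:i + k]))
        for i in range(len(activations) - k + 1)
    ]


def differential_conv1d(activations, weights):
    """Same result as direct_conv1d, computed from deltas (illustrative only)."""
    k = len(weights)
    n_out = len(activations) - k + 1
    if n_out <= 0:
        return []
    # deltas[j] = activations[j + 1] - activations[j]
    deltas = [b - a for a, b in zip(activations, activations[1:])]
    # The first window is computed from the raw values.
    outputs = [sum(w * a for w, a in zip(weights, activations[:k]))]
    for i in range(1, n_out):
        # Window i minus window i-1, element by element, is deltas[i-1 : i-1+k].
        delta_term = sum(
            w * d for w, d in zip(weights, deltas[i - 1:i - 1 + k])
        )
        outputs.append(outputs[-1] + delta_term)
    return outputs


# Hypothetical, spatially correlated row of activations and a small kernel.
row = [100, 102, 103, 103, 105, 104, 106]
kernel = [1, -2, 1]
deltas = [b - a for a, b in zip(row, row[1:])]

print(direct_conv1d(row, kernel))
print(differential_conv1d(row, kernel))
print(max(abs(d) for d in deltas), "vs", max(abs(a) for a in row))
```

In this made-up row the deltas are much smaller in magnitude than the raw values, which is the kind of property that would let deltas be stored and processed with fewer bits when neighboring activations are correlated.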
