
SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing



Abstract

With the recent advance of wearable devices and Internet of Things (IoTs), it becomes attractive to implement the Deep Convolutional Neural Networks (DCNNs) in embedded and portable systems. Currently, executing the software-based DCNNs requires high-performance servers, restricting the widespread deployment on embedded and mobile IoT devices. To overcome this obstacle, considerable research efforts have been made to develop highly-parallel and specialized DCNN accelerators using GPGPUs, FPGAs or ASICs. Stochastic Computing (SC), which uses a bit-stream to represent a number within [-1, 1] by counting the number of ones in the bit-stream, has high potential for implementing DCNNs with high scalability and ultra-low hardware footprint. Since multiplications and additions can be calculated using AND gates and multiplexers in SC, significant reductions in power (energy) and hardware footprint can be achieved compared to the conventional binary arithmetic implementations. The tremendous savings in power (energy) and hardware resources allow immense design space for enhancing scalability and robustness for hardware DCNNs. This paper presents SC-DCNN, the first comprehensive design and optimization framework of SC-based DCNNs, using a bottom-up approach. We first present the designs of function blocks that perform the basic operations in DCNN, including inner product, pooling, and activation function. Then we propose four designs of feature extraction blocks, which are in charge of extracting features from input feature maps, by connecting different basic function blocks with joint optimization. Moreover, efficient weight storage methods are proposed to reduce the area and power (energy) consumption. Putting it all together, with feature extraction blocks carefully selected, SC-DCNN is holistically optimized to minimize area and power (energy) consumption while maintaining high network accuracy. Experimental results demonstrate that the LeNet5 implemented in SC-DCNN consumes only 17 mm² area and 1.53 W power, achieves throughput of 781250 images/s, area efficiency of 45946 images/s/mm², and energy efficiency of 510734 images/J.
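To make the counting-based representation concrete, below is a minimal Python sketch (not taken from the paper) of standard unipolar stochastic computing: a value in [0, 1] is encoded as the probability of a one in the bit-stream, a bit-wise AND of two independent streams multiplies their values, and a multiplexer driven by a fair random select stream performs scaled addition. The bipolar format, which covers the [-1, 1] range mentioned in the abstract, typically replaces the AND gate with an XNOR gate. All names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(p, n_bits):
    """Unipolar SC encoding: a value p in [0, 1] becomes a random bit-stream
    whose fraction of ones equals p in expectation."""
    return (rng.random(n_bits) < p).astype(np.uint8)

def from_stream(bits):
    """Decode by counting ones: the mean of the bit-stream estimates the value."""
    return bits.mean()

n = 4096
a, b = 0.6, 0.5
sa, sb = to_stream(a, n), to_stream(b, n)

# Multiplication: AND of two independent streams has P(1) = P(a=1) * P(b=1),
# i.e. the output stream encodes a * b.
prod = sa & sb

# Scaled addition: a multiplexer with a fair random select stream
# outputs a stream encoding (a + b) / 2.
sel = to_stream(0.5, n)
add = np.where(sel == 1, sa, sb)

print(f"a*b     ~ {from_stream(prod):.3f} (exact {a * b:.3f})")
print(f"(a+b)/2 ~ {from_stream(add):.3f} (exact {(a + b) / 2:.3f})")
```

With a few thousand bits per stream the estimates come close to the exact results, which illustrates the trade-off SC exploits: each arithmetic operation costs only a single gate or multiplexer, at the price of stream length and stochastic error.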

Bibliographic Information

  • Source
    Computer architecture news | 2017, Issue 1 | pp. 405-418 | 14 pages
  • Author Affiliations

    Department of Electrical Engineering and Computer Science, Syracuse University;

    Department of Electrical Engineering and Computer Science, Syracuse University;

    Department of Electrical Engineering and Computer Science, Syracuse University;

    Department of Electrical Engineering and Computer Science, Syracuse University;

    Department of Electrical Engineering and Computer Science, Syracuse University;

    Department of Electrical Engineering, University of Southern California;

    Department of Electrical Engineering, University of Southern California;

    Department of Electrical Engineering, City University of New York, City College;

  • Indexing Information
  • Original Format: PDF
  • Language: English
  • CLC Classification
  • Keywords
