Towards acceleration of deep convolutional neural networks using stochastic computing



Abstract

In recent years, the Deep Convolutional Neural Network (DCNN) has become the dominant approach for almost all recognition and detection tasks and has outperformed humans on certain tasks. Nevertheless, high power consumption and complex topologies have hindered the widespread deployment of DCNNs, particularly in wearable devices and embedded systems with limited area and power budgets. This paper presents a fully parallel and scalable hardware-based DCNN design using Stochastic Computing (SC), which exploits the energy-accuracy trade-off by optimizing SC components in different layers. We first conduct a detailed investigation of the Approximate Parallel Counter (APC) based neuron and the multiplexer-based neuron using SC, and analyze the impact of various design parameters, such as bit-stream length and the number of inputs, on the energy, power, area, and accuracy of the neuron cell. Then, from an architectural perspective, the influence of neuron inaccuracy in different layers on the overall DCNN accuracy (i.e., the software accuracy of the entire DCNN) is studied. Accordingly, a structure optimization method is proposed for a general DCNN architecture, in which neurons in different layers are implemented with optimized SC components, so as to reduce the area, power, and energy of the DCNN while maintaining overall network accuracy. Experimental results show that the proposed approach can find a satisfactory DCNN configuration, which achieves 55X, 151X, and 2X improvements in area, power, and energy, respectively, while the error increases by 2.86%, compared with a conventional binary ASIC implementation.
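
The following is a minimal Python sketch of the unipolar stochastic-computing primitives the abstract refers to: encoding values as random bit streams, multiplication via an AND gate, and a parallel count of product bits in the style of an APC-based neuron. Function names and parameter values are illustrative and not taken from the paper, and the exact population count is used only as a functional stand-in for the hardware APC, which approximates this count with a compact adder tree.

```python
import numpy as np

def to_stream(value, length, rng):
    """Encode a value in [0, 1] as a unipolar stochastic bit stream:
    each bit is 1 with probability equal to the value."""
    return (rng.random(length) < value).astype(np.uint8)

def sc_multiply(stream_a, stream_b):
    """Unipolar SC multiplication: bitwise AND of two independent streams."""
    return stream_a & stream_b

def apc_count(product_streams):
    """APC-style accumulation: at each clock cycle, count how many of the
    parallel product streams carry a 1 (exact count used as a stand-in)."""
    return product_streams.sum(axis=0)

# Toy neuron: inner product of inputs and weights estimated from bit streams.
rng = np.random.default_rng(0)
length = 1024            # bit-stream length: longer -> more accurate, more energy
inputs  = np.array([0.8, 0.3, 0.6])
weights = np.array([0.5, 0.9, 0.2])

x_streams = np.stack([to_stream(v, length, rng) for v in inputs])
w_streams = np.stack([to_stream(v, length, rng) for v in weights])
products  = sc_multiply(x_streams, w_streams)
counts    = apc_count(products)           # per-cycle parallel count
estimate  = counts.sum() / length         # ~ sum_i inputs[i] * weights[i]
print(estimate, inputs @ weights)
```

Increasing `length` reduces the variance of the stochastic estimate at the cost of more clock cycles and energy, which is the bit-stream-length versus accuracy trade-off studied per layer in the paper.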