Workshop on Energy Efficient Machine Learning and Cognitive Computing - NeurIPS Edition

Towards Co-designing Neural Network Function Approximators with In-SRAM Computing

Abstract

We propose a co-design approach for compute-in-memory inference for deep neural networks (DNNs). We use multiplication-free function approximators based on the $\ell_1$ norm along with a co-adapted processing array and compute flow. With this approach, we overcome several deficiencies of current in-SRAM DNN processing, such as the need for DACs at each operating SRAM row/column, high-precision ADCs, limited support for multi-bit weight precision, and limited vector-scale parallelism. We also propose an SRAM-immersed successive-approximation ADC (SA-ADC) that exploits the parasitic capacitance of the SRAM array's bit lines as a capacitive DAC, allowing a low-area within-SRAM implementation. Our $8\times 62$ SRAM macro requires a 5-bit ADC and achieves 105 TOPS/W with 8-bit input/weight processing in 45 nm CMOS. We evaluated the performance of the proposed network on the MNIST, CIFAR10, and CIFAR100 datasets, choosing network configurations that adaptively mix multiplication-free and regular operators. These configurations apply the multiplication-free operator to more than 85% of the total operations and reach 98.6% accuracy on MNIST, 90.2% on CIFAR10, and 66.9% on CIFAR100. Since most operations in the considered configurations map onto the proposed SRAM macros, the efficiency benefits of our compute-in-memory design broadly translate to the system level.
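
The abstract does not spell out the multiplication-free operator itself. One commonly used $\ell_1$-norm-induced multiplication-free operator in the related literature replaces the product $x \cdot w$ with $\mathrm{sign}(x)\,w + \mathrm{sign}(w)\,x = \mathrm{sign}(xw)(|x|+|w|)$, which needs only sign flips and additions in hardware. Below is a minimal NumPy sketch under that assumption; the names mf_op and mf_dot are hypothetical, not from the paper.

    import numpy as np

    def mf_op(x, w):
        # Multiplication-free operator: sign(x)*w + sign(w)*x,
        # which equals sign(x*w) * (|x| + |w|). In hardware this needs
        # only sign flips and adders; NumPy's * here is just modeling.
        return np.sign(x) * w + np.sign(w) * x

    def mf_dot(x, w):
        # "Dot product" built from the operator. Note mf_dot(x, x) = 2*||x||_1,
        # which is the sense in which the approximator is l1-norm based.
        return np.sum(mf_op(x, w))

    # Compare against a regular inner product on a small random vector.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(8)
    w = rng.standard_normal(8)
    print("regular dot     :", x @ w)
    print("mf approximation:", mf_dot(x, w))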
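
The successive-approximation scheme behind the SA-ADC can likewise be illustrated behaviorally: each cycle tests one bit from MSB to LSB and keeps it only if the trial DAC level does not overshoot the sampled voltage. In the proposed macro the trial levels come from charge-sharing on the bit-line parasitic capacitance; the sketch below (a hypothetical sar_adc helper) substitutes plain arithmetic for that capacitive DAC and shows only the 5-bit bit-cycling logic.

    def sar_adc(v_in, v_ref=1.0, bits=5):
        # Behavioral successive-approximation search: test bits MSB -> LSB,
        # keeping each bit only if the trial DAC level stays at or below v_in.
        # In the paper's macro the levels are generated by charge-sharing on
        # the SRAM bit-line parasitic capacitance; arithmetic stands in here.
        code = 0
        for b in reversed(range(bits)):
            trial = code | (1 << b)              # tentatively set this bit
            if trial * v_ref / (1 << bits) <= v_in:
                code = trial                     # keep the bit
        return code

    print(sar_adc(0.40))  # -> 12: 12/32 = 0.375 V, closest level not above 0.40 V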