International Symposium on Microarchitecture

SCOPE: A Stochastic Computing Engine for DRAM-Based In-Situ Accelerator


Abstract

Memory-centric architecture, which bridges the gap between compute and memory, is considered a promising solution to tackle the memory wall and the power wall. Such an architecture integrates computing logic and memory resources in close proximity, in order to exploit the large internal memory bandwidth and reduce data movement overhead. The closer the compute and memory resources are located, the greater these benefits become. DRAM-based in-situ accelerators [1] tightly couple processing units to every memory bitline, achieving the maximum benefits among various memory-centric architectures. However, the processing units in such architectures are typically limited to simple functions like AND/OR due to strict area and power overhead constraints in DRAMs, making it difficult to accomplish complex tasks while providing high performance. In this paper, we address this challenge by applying stochastic computing arithmetic to the DRAM-based in-situ accelerator, targeting the acceleration of error-tolerant applications such as deep learning. In stochastic computing, binary numbers are converted into stochastic bitstreams, turning integer multiplications into simple bitwise AND operations, at the expense of larger memory capacity/bandwidth demands. Stochastic computing is a perfect match for DRAM-based in-situ accelerators because it addresses the in-situ accelerator's low-performance problem by simplifying the operations, while leveraging the in-situ accelerator's advantage of large memory capacity/bandwidth. To further boost performance and compensate for the numerical precision loss, we propose a novel Hierarchical and Hybrid Deterministic (H2D) stochastic computing arithmetic. Finally, we consider quantized deep neural network inference and training applications as a case study. The proposed architecture provides a 2.3× improvement in performance per unit area compared with the binary arithmetic baseline, and a 3.8× improvement over GPU. The proposed H2D arithmetic contributes an 11× performance boost and a 60% numerical precision improvement.
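The core idea the abstract describes, that multiplying two values encoded as stochastic bitstreams reduces to a bitwise AND, can be sketched in a few lines. This is a minimal illustration of standard unipolar stochastic computing, not code from the paper; the function names and bitstream length are illustrative assumptions.

```python
import random

def to_bitstream(p, n, rng):
    # Unipolar encoding: a value p in [0, 1] becomes a stream of n
    # Bernoulli bits, each 1 with probability p.
    return [1 if rng.random() < p else 0 for _ in range(n)]

def sc_multiply(a_bits, b_bits):
    # Bitwise AND of two independent unipolar streams: each result bit
    # is 1 with probability p_a * p_b, so the fraction of 1s in the
    # output estimates the product of the encoded values.
    and_bits = [x & y for x, y in zip(a_bits, b_bits)]
    return sum(and_bits) / len(and_bits)

rng = random.Random(0)
n = 100_000                      # longer streams -> lower variance
a = to_bitstream(0.5, n, rng)
b = to_bitstream(0.25, n, rng)
est = sc_multiply(a, b)          # approximately 0.5 * 0.25 = 0.125
```

The accuracy-versus-length trade-off visible here (the estimate's variance shrinks only as 1/n) is exactly why stochastic computing demands large memory capacity and bandwidth, and why it pairs well with in-situ DRAM accelerators that perform AND across entire bitlines.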
