IEEE Transactions on Parallel and Distributed Systems

iCELIA: A Full-Stack Framework for STT-MRAM-Based Deep Learning Acceleration

Abstract

A large variety of applications rely on deep learning to process big data, learn sophisticated features, and perform complicated tasks. Utilizing emerging non-volatile memory's (NVM's) unique characteristics, including the crossbar array structure and gray-scale cell resistances, to perform neural network (NN) computation is a well-studied approach to accelerating deep learning applications. Compared to other NVM technologies, STT-MRAM has unique advantages in performing NN computation. However, state-of-the-art research has not utilized STT-MRAM for deep learning acceleration because of its device- and architecture-level challenges. Consequently, this paper enables STT-MRAM, for the first time, as an effective and practical deep learning accelerator. In particular, it proposes iCELIA, a full-stack framework spanning multiple design levels: device-level fabrication, circuit-level enhancements, architecture-level synaptic weight quantization, and system-level accelerator design. The primary contributions of iCELIA over our prior work CELIA are a new non-uniform weight quantization scheme and a much-enhanced accelerator system design. The proposed framework significantly mitigates the model accuracy loss caused by reduced data precision in a cohesive manner, constructing a comprehensive STT-MRAM accelerator system for fast NN computation with high energy efficiency and low cost.
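The abstract pairs non-uniform weight quantization with crossbar-based in-memory NN computation. The sketch below only illustrates those two general ideas and is not the paper's iCELIA scheme: it assumes a 1-D k-means-style quantizer and an idealized linear weight-to-conductance mapping, and all function names, conductance values, and parameters are made up for the example.

```python
import numpy as np


def nonuniform_quantize(weights, num_levels=8, iters=50):
    """Cluster weight values with 1-D k-means so quantization levels follow
    the weight distribution rather than being evenly spaced."""
    flat = weights.ravel()
    # Start the centroids at evenly spaced quantiles of the weight values.
    centroids = np.quantile(flat, np.linspace(0.0, 1.0, num_levels))
    for _ in range(iters):
        idx = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
        for k in range(num_levels):
            if np.any(idx == k):
                centroids[k] = flat[idx == k].mean()
    idx = np.argmin(np.abs(flat[:, None] - centroids[None, :]), axis=1)
    return centroids[idx].reshape(weights.shape), centroids


def crossbar_mvm(w_q, x, g_min=1e-6, g_max=1e-4):
    """Idealized crossbar matrix-vector product: each quantized weight maps
    linearly onto a cell conductance, input voltages drive the rows, and the
    column currents (Kirchhoff summation) give the dot products. Device noise
    and other non-idealities are ignored."""
    w_min, w_max = w_q.min(), w_q.max()
    scale = (g_max - g_min) / (w_max - w_min + 1e-12)
    g = g_min + scale * (w_q - w_min)   # conductance assigned to each cell
    currents = x @ g                    # analog accumulation along columns
    # Undo the linear weight-to-conductance mapping to recover x @ w_q.
    offset = g_min - scale * w_min
    return (currents - offset * x.sum(axis=-1, keepdims=True)) / scale


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 16))       # a small weight matrix
    x = rng.normal(size=(4, 64))        # a batch of input vectors
    w_q, levels = nonuniform_quantize(w, num_levels=8)
    print("quantization levels:", np.round(levels, 3))
    print("crossbar result matches x @ w_q:",
          np.allclose(crossbar_mvm(w_q, x), x @ w_q))
```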
