IEEE Transactions on Computers

NNPIM: A Processing In-Memory Architecture for Neural Network Acceleration


Abstract

Neural networks (NNs) have shown great ability to process emerging applications such as speech recognition, language recognition, image classification, video segmentation, and gaming. It is therefore important to make NNs efficient. Although attempts have been made to reduce NNs' computation cost, the data movement between memory and processing cores remains the main bottleneck for NNs' energy consumption and execution time, making NN implementations significantly slower on traditional CPU/GPU cores. In this paper, we propose a novel processing in-memory architecture, called NNPIM, that significantly accelerates the neural network inference phase inside memory. First, we design a crossbar memory architecture that supports fast addition, multiplication, and search operations inside the memory. Second, we introduce simple optimization techniques that significantly improve NNs' performance and reduce the overall energy consumption. We also map all NN functionalities onto parallel in-memory components. To further improve efficiency, our design supports weight sharing, which reduces the number of computations performed in memory and consequently speeds up NNPIM computation. We compare the efficiency of the proposed NNPIM with a GPU and state-of-the-art PIM architectures. Our evaluation shows that our design achieves 131.5x higher energy efficiency and is 48.2x faster than an NVIDIA GTX 1080 GPU. Compared to state-of-the-art neural network accelerators, NNPIM achieves on average 3.6x higher energy efficiency and is 4.6x faster, while providing the same classification accuracy.
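To make the weight-sharing claim concrete, the following is a minimal NumPy sketch, not the paper's implementation: it assumes a simple uniform codebook (the abstract does not specify how NNPIM chooses its shared values), and the names `build_codebook` and `shared_weight_dot` are illustrative. It demonstrates the computation reduction the abstract describes: with k shared weight values, an N-element dot product needs only k multiplications instead of N, because all inputs mapped to the same shared weight can be accumulated first.

```python
import numpy as np

def build_codebook(weights, k=16):
    """Quantize weights to k shared values (a uniform codebook).

    Hypothetical stand-in for NNPIM's weight-sharing step; the
    abstract does not state how the shared values are selected.
    """
    centroids = np.linspace(weights.min(), weights.max(), k)
    idx = np.abs(weights[:, None] - centroids).argmin(axis=1)
    return centroids, idx  # k shared values + per-weight index

def shared_weight_dot(x, centroids, idx):
    """Dot product under weight sharing: accumulate all inputs that
    map to the same shared weight, then do only k multiplications."""
    sums = np.zeros_like(centroids)
    np.add.at(sums, idx, x)      # N additions (cheap in-memory ops)
    return sums @ centroids      # only k multiplications total

x = np.random.randn(1024)
w = np.random.randn(1024)
centroids, idx = build_codebook(w, k=16)
print(shared_weight_dot(x, centroids, idx))  # approximates x @ w
print(x @ w)                                 # exact, for comparison
```

In a PIM setting, the remaining k multiplications could additionally be served from precomputed lookup tables retrieved by the in-memory search operation; this is one plausible use of the fast addition, multiplication, and search support the abstract describes, not a detail confirmed by it.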
