IEEE Transactions on Computers

SPARE: Spiking Neural Network Acceleration Using ROM-Embedded RAMs as In-Memory-Computation Primitives


Abstract

From what little we know about the human brain, its inherent cognitive mechanism is very different from that of de facto state-of-the-art computing platforms. The human brain uses distributed yet integrated memory and computation units, unlike the physically separate memory and computation cores in typical von Neumann architectures. Despite the huge success of artificial intelligence, hardware systems running these algorithms consume orders of magnitude more energy than the human brain, mainly due to heavy data movement between the memory unit and the computation cores. Spiking neural networks (SNNs), built using bio-plausible neuron and synaptic models, have emerged as a power-efficient choice for designing cognitive applications. These algorithms involve several lookup-table (LUT)-based function evaluations, such as high-order polynomials and transcendental functions for solving complex neuro-synaptic models, which typically require additional storage and thus larger memories. To that effect, we propose 'SPARE', an in-memory, distributed processing architecture built on ROM-embedded RAM technology for accelerating SNNs. ROM-embedded RAMs allow LUTs (for neuro-synaptic models) to be stored within a typical memory array without additional area overhead. Our proposed architecture consists of a 2-D array of Processing Elements (PEs), wherein each PE has its own ROM-embedded RAM structure and executes part of the SNN computation. Since most of the computations (including multiple math-table evaluations) are done locally within each PE, unnecessary data transfers are avoided, thereby alleviating the problems arising from a physically separate remote memory unit and computation core. SPARE thus leverages both the hardware benefits of distributed in-memory processing and the algorithmic benefits of SNNs.
We evaluate SPARE for two different ROM-embedded RAM structures: CMOS-based ROM-embedded SRAMs (R-SRAMs) and STT-MRAM-based ROM-embedded MRAMs (R-MRAMs). Moreover, we analyze the energy, area, and performance trade-offs of the two technologies on a range of image classification benchmarks. Furthermore, we leverage the additional storage density to implement complex neuro-synaptic functionalities, which enhances the utility of the proposed architecture by allowing any neuron or synaptic behavior required by the application to be implemented. Our results show up to ~1.75x, ~1.95x, and ~1.95x improvements in energy, iso-storage area, and iso-area performance, respectively, for neural network accelerators built on ROM-embedded RAM primitives.
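To make the LUT-based evaluation idea concrete, the following is a minimal illustrative sketch (not code from the paper): a leaky-integrate-and-fire (LIF) neuron update in which the exponential leak factor exp(-dt/tau) is read from a precomputed lookup table rather than computed at runtime, mimicking how an architecture like SPARE could serve math-table evaluations from ROM embedded in the memory array. All names, table sizes, and parameter values here are assumptions for illustration.

```python
import numpy as np

# Hypothetical parameters (assumptions, not from the paper).
LUT_BITS = 8                                  # 8-bit index into the "ROM" table
TAUS = np.linspace(1.0, 64.0, 2**LUT_BITS)    # candidate membrane time constants
DT = 1.0                                      # simulation timestep
DECAY_LUT = np.exp(-DT / TAUS)                # "ROM" contents: precomputed decay factors

def lif_step(v, spikes_in, weights, tau_idx, v_th=1.0):
    """One LIF update; the transcendental leak term is a table lookup."""
    decay = DECAY_LUT[tau_idx]                # LUT read instead of exp() at runtime
    v = v * decay + weights @ spikes_in       # membrane leak + synaptic integration
    fired = v >= v_th                         # threshold comparison
    v = np.where(fired, 0.0, v)               # reset membrane potential on spike
    return v, fired

# Toy usage: 4 neurons receiving 3 input spike lines.
rng = np.random.default_rng(0)
w = rng.normal(0.3, 0.1, size=(4, 3))
v = np.zeros(4)
v, fired = lif_step(v, np.array([1.0, 0.0, 1.0]), w, tau_idx=128)
```

In a PE-local setting, each PE would hold such a table in its ROM-embedded RAM and index it per neuron, so no transcendental evaluation or off-array data movement is needed in the inner loop.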
