IEEE Transactions on Computers

SPARE: Spiking Neural Network Acceleration Using ROM-Embedded RAMs as In-Memory-Computation Primitives


Abstract

From what little we know about the human brain, its inherent cognitive mechanism is very different from that of de facto state-of-the-art computing platforms. The human brain uses distributed yet integrated memory and computation units, unlike the physically separate memory and computation cores in typical von Neumann architectures. Despite the huge success of artificial intelligence, hardware systems running these algorithms consume orders of magnitude more energy than the human brain, mainly due to heavy data movement between the memory unit and the computation cores. Spiking neural networks (SNNs), built using bio-plausible neuron and synaptic models, have emerged as a power-efficient choice for designing cognitive applications. These algorithms involve several lookup-table (LUT)-based function evaluations, such as high-order polynomials and transcendental functions for solving complex neuro-synaptic models, which typically require additional storage and thus larger memories. To that effect, we propose 'SPARE', an in-memory, distributed processing architecture built on ROM-embedded RAM technology for accelerating SNNs. ROM-embedded RAMs allow LUTs (for neuro-synaptic models) to be stored within a typical memory array without additional area overhead. Our proposed architecture consists of a 2-D array of Processing Elements (PEs), wherein each PE has its own ROM-embedded RAM structure and executes part of the SNN computation. Since most of the computations (including multiple math-table evaluations) are done locally within each PE, unnecessary data transfers are avoided, thereby alleviating the problems arising from a physically separate remote memory unit and computation core. SPARE thus leverages both the hardware benefits of distributed in-memory processing and the algorithmic benefits of SNNs.
We evaluate SPARE for two different ROM-embedded RAM structures: CMOS-based ROM-embedded SRAMs (R-SRAMs) and STT-MRAM-based ROM-embedded MRAMs (R-MRAMs). Moreover, we analyze the energy, area, and performance trade-offs of the two technologies on a range of image classification benchmarks. Furthermore, we leverage the additional storage density to implement complex neuro-synaptic functionalities, which enhances the utility of the proposed architecture by allowing any neuron or synaptic behavior required by the application to be implemented. Our results show up to ~1.75x, ~1.95x, and ~1.95x improvements in energy, iso-storage area, and iso-area performance, respectively, for neural network accelerators built on ROM-embedded RAM primitives.
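To make the LUT-based evaluation idea concrete, the following is a minimal illustrative sketch (not code from the paper): a leaky-integrate-and-fire (LIF) neuron update in which the exponential leak factor exp(-dt/tau) is read from a precomputed lookup table rather than computed at runtime, mimicking how an architecture like SPARE could serve math-table evaluations from ROM embedded in the memory array. All names, table sizes, and parameter values here are assumptions for illustration.

```python
import numpy as np

# Hypothetical parameters (assumptions, not from the paper).
LUT_BITS = 8                                  # 8-bit index into the "ROM" table
TAUS = np.linspace(1.0, 64.0, 2**LUT_BITS)    # candidate membrane time constants
DT = 1.0                                      # simulation timestep
DECAY_LUT = np.exp(-DT / TAUS)                # "ROM" contents: precomputed decay factors

def lif_step(v, spikes_in, weights, tau_idx, v_th=1.0):
    """One LIF update; the transcendental leak term is a table lookup."""
    decay = DECAY_LUT[tau_idx]                # LUT read instead of exp() at runtime
    v = v * decay + weights @ spikes_in       # membrane leak + synaptic integration
    fired = v >= v_th                         # threshold comparison
    v = np.where(fired, 0.0, v)               # reset membrane potential on spike
    return v, fired

# Toy usage: 4 neurons receiving 3 input spike lines.
rng = np.random.default_rng(0)
w = rng.normal(0.3, 0.1, size=(4, 3))
v = np.zeros(4)
v, fired = lif_step(v, np.array([1.0, 0.0, 1.0]), w, tau_idx=128)
```

In a PE-local setting, each PE would hold such a table in its ROM-embedded RAM and index it per neuron, so no transcendental evaluation or off-array data movement is needed in the inner loop.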
