
'In-memory Computing': Accelerating AI Applications



Abstract

In today's computing systems based on the conventional von Neumann architecture, the memory and processing units are distinct. Performing computations therefore requires moving a significant amount of data back and forth between the physically separated memory and processing units. This costs time and energy, and constitutes an inherent performance bottleneck. It is becoming increasingly clear that for application areas such as AI (and indeed cognitive computing in general), we need to transition to computing architectures in which memory and logic coexist in some form. Brain-inspired neuromorphic computing and the fascinating new area of in-memory computing, or computational memory, are two key non-von Neumann approaches being researched. A critical requirement in these novel computing paradigms is a very-high-density, low-power, variable-state, programmable, and non-volatile nanoscale memory device. There are many examples of such nanoscale memory devices, in which the information is stored either as charge or as resistance. One particularly promising example is the phase-change memory (PCM) device, which is very well suited to this need owing to its multi-level storage capability and potential scalability. In in-memory computing, the physics of the nanoscale memory devices, as well as the organization of such devices in crossbar arrays, are exploited to perform certain computational tasks within the memory unit. I will present how computational memories accelerate AI applications and will show small- and large-scale experimental demonstrations that perform high-level computational primitives, such as ultra-low-power inference engines, optimization solvers including compressed sensing and sparse coding, linear solvers, and temporal correlation detection. Moreover, I will discuss the efficacy of this approach in efficiently addressing not only inference but also the training of deep neural networks.
The results show that this co-existence of computation and storage at the nanometer scale could be the enabler for new, ultra-dense, low-power, and massively parallel computing systems. Thus, by augmenting conventional computing systems, in-memory computing could help achieve orders of magnitude improvement in performance and efficiency.
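The core idea behind the crossbar computation described above can be illustrated in a few lines. The sketch below is a hypothetical, idealized simulation (not the speaker's actual experimental setup): each weight is stored as a pair of device conductances, Ohm's law produces per-device currents, and Kirchhoff's current law sums them along the columns, so the matrix-vector multiply happens inside the memory array. The noise model and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def crossbar_matvec(weights, x, noise_std=0.02):
    """Simulate an analog matrix-vector multiply on an idealized
    resistive crossbar.

    Each weight is encoded as a differential pair of conductances
    (G+ minus G-) so that negative values can be represented.
    Applying the input vector as voltages, Ohm's law gives the
    per-device currents and Kirchhoff's current law sums them along
    each column: the multiply-accumulate happens in place, inside
    the memory array, with no data movement to a processor.
    """
    g_pos = np.clip(weights, 0, None)
    g_neg = np.clip(-weights, 0, None)
    # Programming nanoscale devices is imprecise: model the stored
    # conductance variation as additive Gaussian noise (assumed model).
    g_pos = g_pos + noise_std * rng.standard_normal(weights.shape)
    g_neg = g_neg + noise_std * rng.standard_normal(weights.shape)
    return (g_pos - g_neg) @ x  # column currents = analog dot products

W = rng.standard_normal((4, 8))  # weights programmed into the array
x = rng.standard_normal(8)       # input applied as voltages

exact = W @ x
approx = crossbar_matvec(W, x)
print(np.max(np.abs(exact - approx)))  # small deviation from device noise
```

The approximate nature of the result is the trade-off this paradigm makes: a single read operation replaces O(n*m) multiply-accumulates, at the cost of analog imprecision, which inference workloads in particular can tolerate.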
