IEEE International Solid-State Circuits Conference

14.2 A Compute SRAM with Bit-Serial Integer/Floating-Point Operations for Programmable In-Memory Vector Acceleration

Abstract

Data movement and memory bandwidth are dominant factors in the energy and performance of both general-purpose CPUs and GPUs. This has led to extensive research focused on in-memory computing, which moves computation to where the data is located. With this approach, computation is often performed on the memory bit-lines in the analog domain using current summing [1]-[3], which requires expensive analog-to-digital and digital-to-analog conversions at the array boundary. In addition, such analog computation is very sensitive to PVT variations, limiting precision. More recently, full-rail (digital) binary in-memory computing was proposed to avoid this conversion overhead and improve robustness [4], [5]. However, both prior in-memory approaches suffer from the same major limitations: they accelerate only one type of algorithm and are inherently restricted to a very specific application domain due to their limited, fixed bit-width precision and non-programmable architecture. Software algorithms, on the other hand, continue to evolve rapidly, especially in novel application domains such as neural networks, vision, and graph processing, making rigid accelerators of limited use. Furthermore, most available SRAM in today's chips is located in the caches of CPUs or GPUs. These large CPU and GPU SRAM stores present an opportunity for extensive in-memory computing and have, to date, remained largely untapped.
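
The abstract does not describe the bit-serial datapath itself, but the general idea behind bit-serial integer arithmetic can be illustrated in software. The C sketch below is a hypothetical analogue, not the paper's circuit: each of LANES vector lanes processes one bit position per step while a per-lane carry is held between steps, so operand width (BITS) becomes a loop count rather than a fixed hardware parameter. The names LANES, BITS, and bitserial_add are illustrative assumptions, not identifiers from the paper.

#include <stdint.h>
#include <stdio.h>

/* Illustrative software model of bit-serial integer addition across vector
 * lanes: one bit position is processed per "cycle" for all lanes at once,
 * with a per-lane carry bit kept between cycles. A sketch of the concept
 * only, not the paper's in-SRAM implementation. */

#define LANES 8   /* number of parallel vector lanes (assumed) */
#define BITS  32  /* operand width in bits (assumed)           */

void bitserial_add(const uint32_t a[LANES],
                   const uint32_t b[LANES],
                   uint32_t sum[LANES])
{
    uint32_t carry[LANES] = {0};   /* one carry bit per lane */

    for (int lane = 0; lane < LANES; lane++)
        sum[lane] = 0;

    /* Outer loop walks bit positions (cycles); the inner loop models the
     * lanes that a real array would evaluate simultaneously. */
    for (int bit = 0; bit < BITS; bit++) {
        for (int lane = 0; lane < LANES; lane++) {
            uint32_t ai = (a[lane] >> bit) & 1u;
            uint32_t bi = (b[lane] >> bit) & 1u;
            uint32_t s  = ai ^ bi ^ carry[lane];              /* full adder */
            carry[lane] = (ai & bi) | (carry[lane] & (ai ^ bi));
            sum[lane]  |= s << bit;
        }
    }
}

int main(void)
{
    uint32_t a[LANES] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint32_t b[LANES] = {10, 20, 30, 40, 50, 60, 70, 80};
    uint32_t s[LANES];

    bitserial_add(a, b, s);
    for (int i = 0; i < LANES; i++)
        printf("%u + %u = %u\n", a[i], b[i], s[i]);
    return 0;
}

In a bit-serial in-memory design of this kind, the work of the inner per-lane loop would be done in parallel across the array in a single cycle, which is why bit-serial operation can trade per-element latency for wide lane-level parallelism and a precision that is set by software rather than fixed in hardware.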