IEEE Custom Integrated Circuits Conference

A 16K Current-Based 8T SRAM Compute-In-Memory Macro with Decoupled Read/Write and 1-5bit Column ADC



Abstract

A novel 8T SRAM-based bitcell is proposed for current-based compute-in-memory dot-product operations. The proposed bitcell adds two NMOS transistors to the standard 6T SRAM cell to decouple the SRAM read and write operations. A 128×128 8T SRAM bitcell array is built for processing a vector-matrix multiplication (or parallel dot-products) with 64 binary (0 or 1) inputs, 64×128 binary (-1 or +1) weights, and 128 outputs of 1 to 5 bits each. Each column (i.e. neuron) of the proposed SRAM compute-in-memory macro consists of 64 bitcells for the dot-product, 32 bitcells for the ADC, and 32 bitcells for calibration. The column-based neuron minimizes the ADC overhead by reusing the sense amplifier used for SRAM read. The column-wise ADC converts the analog dot-product results to N-bit output codes (N = 1 to 5) by sweeping reference levels, generated with replica bitcells, over 2^N - 1 cycles per conversion. Monte-Carlo simulations and test-chip measurement results have verified both linearity and process variation. The largest variation (σ = 2.48%) results in an MNIST classification accuracy of 96.2% (i.e. 0.4% lower than a baseline with no variation). A test chip is fabricated in 65 nm technology, and the 16K SRAM bitcell array occupies 0.055 mm². The energy efficiency for 1-bit operations ranges from 490 TOPS/W in 1-bit ADC mode to 15.8 TOPS/W in 5-bit ADC mode, using 0.45/0.8 V core supplies at 200 MHz.
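The column operation described in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the authors' circuit: it treats the dot-product as an ideal integer sum, assumes the 2^N - 1 reference levels are uniformly spaced over the full range of -64 to +64, and ignores calibration, noise, and process variation. The function names (`dot_product`, `sweep_adc`, `conversion_cycles`, `macro_outputs`) are hypothetical and chosen for illustration.

```python
def dot_product(inputs, weights):
    """Ideal column dot-product: inputs in {0, 1}, weights in {-1, +1}."""
    assert all(x in (0, 1) for x in inputs)
    assert all(w in (-1, 1) for w in weights)
    return sum(x * w for x, w in zip(inputs, weights))


def conversion_cycles(n_bits):
    """Number of reference-sweep cycles per conversion: 2^N - 1."""
    return 2 ** n_bits - 1


def sweep_adc(value, n_bits, full_scale=64):
    """Convert `value` to an n_bits code by sweeping reference levels.

    One comparison is made per cycle; the output code is the number of
    reference levels that `value` reaches or exceeds.
    ASSUMPTION: references are uniformly spaced over
    [-full_scale, +full_scale]; the abstract does not specify the spacing.
    """
    step = 2 * full_scale / 2 ** n_bits
    code = 0
    for k in range(1, conversion_cycles(n_bits) + 1):
        ref = -full_scale + k * step  # reference level for cycle k
        if value >= ref:
            code += 1
    return code


def macro_outputs(inputs, weight_columns, n_bits):
    """Vector-matrix multiply: one ADC code per column (neuron)."""
    return [sweep_adc(dot_product(inputs, col), n_bits)
            for col in weight_columns]


# Example: 64 inputs, two columns with all +1 and all -1 weights.
inputs = [1] * 64
columns = [[1] * 64, [-1] * 64]
codes_5bit = macro_outputs(inputs, columns, n_bits=5)
codes_1bit = macro_outputs(inputs, columns, n_bits=1)
```

In this model a 5-bit conversion takes 31 sweep cycles and a 1-bit conversion takes 1. The reported efficiency ratio, 490 / 15.8 ≈ 31, is consistent with that cycle count, although the abstract does not state this relationship explicitly.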
