Symposium on VLSI Circuits

CHIMERA: A 0.92 TOPS, 2.2 TOPS/W Edge AI Accelerator with 2 MByte On-Chip Foundry Resistive RAM for Efficient Training and Inference

Abstract

CHIMERA is the first non-volatile deep neural network (DNN) chip for edge AI training and inference using foundry on-chip resistive RAM (RRAM) macros and no off-chip memory. CHIMERA achieves 0.92 TOPS peak performance and 2.2 TOPS/W. We scale inference to 6x larger DNNs by connecting 6 CHIMERAs with just 4% execution time and 5% energy costs, enabled by communication-sparse DNN mappings that exploit RRAM non-volatility through quick chip wakeup/shutdown (33 µs). We demonstrate the first incremental edge AI training which overcomes RRAM write energy, speed, and endurance challenges. Our training achieves the same accuracy as traditional algorithms with up to 283x fewer RRAM weight update steps and 340x better energy-delay product. We thus demonstrate 10 years of 20 samples/minute incremental edge AI training on CHIMERA.
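
As a quick back-of-envelope reading of the headline numbers above (not part of the paper; all variable names are illustrative), the peak throughput and efficiency figures imply a peak power draw of roughly 0.4 W, and the claimed 10 years of training at 20 samples/minute corresponds to on the order of 10^8 weight-updating samples:

```python
# Rough arithmetic derived solely from the abstract's headline numbers.
# Variable names are illustrative; nothing here comes from the CHIMERA design itself.

peak_tops = 0.92              # peak throughput, tera-operations per second
efficiency_tops_per_w = 2.2   # energy efficiency, TOPS per watt

# Implied peak power draw: throughput divided by efficiency.
peak_power_w = peak_tops / efficiency_tops_per_w
print(f"Implied peak power: {peak_power_w:.2f} W")        # ~0.42 W

# Lifetime training volume: 20 samples/minute sustained for 10 years.
samples_per_minute = 20
minutes_per_year = 365 * 24 * 60
lifetime_samples = samples_per_minute * minutes_per_year * 10
print(f"Samples over 10 years: {lifetime_samples:,}")     # ~105 million
```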