Symposium on VLSI Circuits

A 7.3 M Output Non-Zeros/J Sparse Matrix-Matrix Multiplication Accelerator using Memory Reconfiguration in 40 nm

Abstract

A sparse matrix-matrix multiplication (SpMM) accelerator with 48 heterogeneous cores and a reconfigurable memory hierarchy is fabricated in 40 nm CMOS. On-chip memories are reconfigured as scratchpad or cache and interconnected with synthesizable coalescing crossbars for efficient memory access in each phase of the algorithm. The 2.0 mm × 2.6 mm chip exhibits a 12.6× (8.4×) energy efficiency gain, an 11.7× (77.6×) off-chip bandwidth efficiency gain, and a 17.1× (36.9×) compute density gain against a high-end CPU (GPU) across a diverse set of synthetic and real-world power-law-graph-based sparse matrices.
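
For readers unfamiliar with SpMM, a minimal software sketch may help make "each phase of the algorithm" and "output non-zeros" concrete. The abstract does not say which SpMM formulation the chip implements or what its phases are, so the two-phase outer-product scheme below (multiply, then merge) is purely an illustrative assumption, and the function name `spmm_outer_product` and the dictionary-based storage layout are hypothetical. It says nothing about the accelerator's cores, crossbars, or memory modes.

```python
from collections import defaultdict


def spmm_outer_product(a_cols, b_rows):
    """Multiply sparse A (stored by column) by sparse B (stored by row).

    a_cols[k] lists the (row_index, value) pairs of column k of A.
    b_rows[k] lists the (col_index, value) pairs of row k of B.
    Returns C = A * B as a dict {(i, j): value} holding only non-zeros.
    Illustrative assumption only, not the chip's documented algorithm.
    """
    # Phase 1 (multiply): pair column k of A with row k of B and emit
    # one partial product per (non-zero of A, non-zero of B) pair.
    partial_products = []
    for k, column in a_cols.items():
        row = b_rows.get(k, [])
        for i, a_val in column:
            for j, b_val in row:
                partial_products.append((i, j, a_val * b_val))

    # Phase 2 (merge): accumulate partial products that share an
    # output coordinate, then drop entries that cancel to zero.
    merged = defaultdict(float)
    for i, j, value in partial_products:
        merged[(i, j)] += value
    return {index: value for index, value in merged.items() if value != 0}


# A = [[1, 0], [2, 3]] and B = [[0, 4], [5, 0]] in the layout above.
a_cols = {0: [(0, 1.0), (1, 2.0)], 1: [(1, 3.0)]}
b_rows = {0: [(1, 4.0)], 1: [(0, 5.0)]}
c = spmm_outer_product(a_cols, b_rows)
output_nonzeros = len(c)  # the quantity counted by an "output non-zeros" metric
```

In this sketch the multiply phase streams through its inputs and writes partial products in order, while the merge phase accesses them by output coordinate. That difference in access pattern is one plausible reason, offered here as an assumption rather than a claim from the abstract, why a design might want scratchpad behaviour in one phase and cache behaviour in another.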