IEEE International Solid-State Circuits Conference

15.4 A 5.99-to-691.1TOPS/W Tensor-Train In-Memory-Computing Processor Using Bit-Level-Sparsity-Based Optimization and Variable-Precision Quantization



Abstract

Computing-in-memory (CIM) improves energy efficiency by enabling parallel multiply-and-accumulate (MAC) operations and reducing memory accesses [1–4]. However, today's typical neural networks (NNs) usually exceed on-chip memory capacity, so a CIM-based processor may encounter a memory bottleneck [5]. Tensor-train (TT) is a tensor decomposition method that decomposes a $d$-dimensional tensor into $d$ 4D tensor-cores $\left(\mathrm{TCs}: G_{k}\left[r_{k-1}, n_{k}, m_{k}, r_{k}\right],\ k=1,\ldots,d\right)$ [6]. $G_{k}$ can be viewed as a 2D $n_{k} \times m_{k}$ array, where each element is an $r_{k-1} \times r_{k}$ matrix. The TCs require $\sum_{k \in [1,d]} r_{k-1} n_{k} m_{k} r_{k}$ parameters to represent the original tensor, which has $\prod_{k \in [1,d]} n_{k} m_{k}$ parameters. Since $r_{k}$ is typically small, the kernels and weight matrices of convolutional, fully-connected, and recurrent layers can be compressed significantly by TT decomposition, enabling storage of an entire NN in a CIM-based processor.
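The parameter-count comparison in the abstract can be sketched numerically. The layer shape, mode factorization, and TT-ranks below are illustrative assumptions, not figures from the paper:

```python
# Parameter count of TT-cores G_k[r_{k-1}, n_k, m_k, r_k] versus the
# original tensor with prod_k(n_k * m_k) entries, per the abstract.

def tt_params(shapes, ranks):
    """Sum over k of r_{k-1} * n_k * m_k * r_k."""
    assert len(ranks) == len(shapes) + 1  # includes boundary ranks r_0 and r_d
    return sum(ranks[k] * n * m * ranks[k + 1]
               for k, (n, m) in enumerate(shapes))

def dense_params(shapes):
    """Product over k of n_k * m_k (the uncompressed tensor)."""
    total = 1
    for n, m in shapes:
        total *= n * m
    return total

# Hypothetical example: a 1024x1024 weight matrix reshaped into d = 5 modes
# of size 4x4 each, with boundary ranks r_0 = r_5 = 1 and internal ranks 8.
shapes = [(4, 4)] * 5
ranks = [1, 8, 8, 8, 8, 1]
print(dense_params(shapes))      # 1048576
print(tt_params(shapes, ranks))  # 3328
```

With these assumed ranks the TT form needs 3,328 parameters instead of 1,048,576 (about 315x smaller), which illustrates why small $r_k$ makes an entire NN fit in on-chip CIM memory.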


