2019 Spring Simulation Conference

Systolic Sparse Matrix Vector Multiply in the Age of TPUs and Accelerators



Abstract

Tensor Processing Units have brought systolic arrays back as a computational alternative for high performance computing. Google recently presented a Tensor Processing Unit that handles matrix multiplication using systolic arrays. This unit is designed for dense matrices only; as the authors state, sparse architectural support was omitted for the time being, but sparsity will be a focus of future designs. We propose a systolic array that computes the sparse matrix-vector product in T2(n) ≈ ⌈nnz/2⌉ + 2n + 2 cycles using 2n + 2 processing elements. The proposed systolic array also uses accumulators to collect the partial results of the output vector and supports adaptive tiling.
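To make the quantities in the abstract concrete, the following is a minimal sketch (not the paper's hardware design): a plain software CSR sparse matrix-vector product as a functional reference, together with the cycle-count estimate T2(n) ≈ ⌈nnz/2⌉ + 2n + 2 quoted above. The function names and the CSR layout are illustrative assumptions.

```python
# Hypothetical reference sketch, assuming a standard CSR layout.
# The systolic dataflow itself is not modeled here; only the
# mathematical result y = A @ x and the abstract's cycle estimate.
from math import ceil

def spmv_csr(values, col_idx, row_ptr, x):
    """Reference CSR sparse matrix-vector product: returns y = A @ x."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        # Accumulate the nonzeros of row i, mirroring the role of the
        # per-row accumulators described in the abstract.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

def systolic_cycle_estimate(nnz, n):
    """Cycle count claimed in the abstract: ceil(nnz/2) + 2n + 2,
    for an array of 2n + 2 processing elements."""
    return ceil(nnz / 2) + 2 * n + 2

# 3x3 example: A = [[1, 0, 2], [0, 3, 0], [4, 0, 5]], x = [1, 1, 1]
values  = [1.0, 2.0, 3.0, 4.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = spmv_csr(values, col_idx, row_ptr, [1.0, 1.0, 1.0])
# y == [3.0, 3.0, 9.0]; estimate: ceil(5/2) + 2*3 + 2 = 11 cycles
```

For this 3×3 matrix with nnz = 5, the estimate evaluates to 11 cycles, illustrating that for matrices where nnz grows faster than n the ⌈nnz/2⌉ term dominates the latency.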


