Journal of Parallel and Distributed Computing

Cache simulation for irregular memory traffic on multi-core CPUs: Case study on performance models for sparse matrix-vector multiplication



Abstract

Parallel computations with irregular memory access patterns are often limited by the memory subsystems of multi-core CPUs, though it can be difficult to pinpoint and quantify performance bottlenecks precisely. We present a method for estimating volumes of data traffic caused by irregular, parallel computations on multi-core CPUs with memory hierarchies containing both private and shared caches. Further, we describe a performance model based on these estimates that applies to bandwidth-limited computations. As a case study, we consider two standard algorithms for sparse matrix-vector multiplication, a widely used, irregular kernel. Using three different multi-core CPU systems and a set of matrices that induce a range of irregular memory access patterns, we demonstrate that our cache simulation combined with the proposed performance model accurately quantifies performance bottlenecks that would not be detected using standard best- or worst-case estimates of the data traffic volume.
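
To make the idea concrete, the following is a minimal sketch, not the authors' simulator. It assumes a CSR-format matrix, a single fully associative LRU cache, and traffic counted only for the input vector x; the function names and the parameters `cache_lines`, `line_bytes`, and `elem_bytes` are hypothetical. It shows how a simulated traffic volume can fall between the best-case estimate (each cache line of x loaded once) and the worst-case estimate (every access misses), and how a traffic volume translates into a bandwidth-bound time estimate.

```python
from collections import OrderedDict


def csr_spmv(rowptr, colidx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(rowptr) - 1)
    for i in range(len(y)):
        for k in range(rowptr[i], rowptr[i + 1]):
            # The access x[colidx[k]] is the irregular one.
            y[i] += values[k] * x[colidx[k]]
    return y


def x_traffic_bytes(colidx, cache_lines, line_bytes=64, elem_bytes=8):
    """Bytes loaded for x, assuming one fully associative LRU cache.

    Simplification: real CPUs have set-associative, multi-level caches,
    and the matrix data and y also compete for cache space.
    """
    cache = OrderedDict()  # keys are cache-line indices, ordered by recency
    misses = 0
    for j in colidx:
        line = (j * elem_bytes) // line_bytes
        if line in cache:
            cache.move_to_end(line)  # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict least recently used
    return misses * line_bytes


def bandwidth_bound_seconds(total_bytes, bandwidth_bytes_per_s):
    """Time estimate for a computation limited purely by memory bandwidth."""
    return total_bytes / bandwidth_bytes_per_s


# Tiny 3x3 example; one x element per cache line to keep the numbers small.
rowptr = [0, 2, 3, 5]
colidx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 1.0, 1.0]

y = csr_spmv(rowptr, colidx, values, x)
small_cache = x_traffic_bytes(colidx, cache_lines=2, line_bytes=8)
large_cache = x_traffic_bytes(colidx, cache_lines=3, line_bytes=8)
```

In this toy case a cache holding all three lines of x reproduces the best-case volume, while a two-line cache thrashes and reproduces the worst case. For realistic matrices and cache sizes the simulated volume generally lies somewhere in between, which is the gap that best- and worst-case estimates cannot resolve.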
