Journal of Parallel and Distributed Computing

Optimizing nonzero-based sparse matrix partitioning models via reducing latency

Abstract

For the parallelization of sparse matrix-vector multiplication (SpMV) on distributed memory systems, nonzero-based fine-grain and medium-grain partitioning models attain the lowest communication volume and computational imbalance among all partitioning models. This usually comes, however, at the expense of a high message count, i.e., high latency overhead. This work addresses this shortcoming by proposing new fine-grain and medium-grain models that minimize communication volume and message count in a single partitioning phase. The new models use message nets to encapsulate the minimization of total message count. We further fine-tune these models by proposing delayed addition and thresholding for message nets in order to establish a trade-off between the conflicting objectives of minimizing communication volume and minimizing message count. Experiments on an extensive dataset of nearly one thousand matrices show that the proposed models improve the total message count of the original nonzero-based models by up to 27% on average, which is reflected in the parallel runtime of SpMV as an average reduction of 15% on 512 processors. © 2018 Elsevier Inc. All rights reserved.
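
To make the two competing objectives concrete, the following is a minimal sketch of how communication volume and message count could be measured for a given nonzero-based partition. It is an illustration only, not the models proposed in the paper: the function name `comm_stats`, its arguments, and the assumption of a two-phase row-column-parallel SpMV (an expand phase for x entries, a fold phase for partial y results, with messages in the two phases counted separately) are all assumptions made for this example.

```python
from collections import defaultdict

def comm_stats(nonzeros, nz_owner, x_owner, y_owner):
    """Return (total_volume, total_message_count) for y = A x.

    Hypothetical helper for illustration.
    nonzeros: list of (i, j) coordinates of the nonzeros of A
    nz_owner: nz_owner[k] is the processor holding nonzeros[k]
    x_owner[j]: processor owning x_j; y_owner[i]: processor owning y_i
    """
    need_x = defaultdict(set)  # column j -> processors holding a nonzero in it
    make_y = defaultdict(set)  # row i -> processors holding a nonzero in it
    for (i, j), p in zip(nonzeros, nz_owner):
        need_x[j].add(p)
        make_y[i].add(p)

    volume = 0
    messages = set()  # distinct (sender, receiver, phase) triples

    # Expand phase: the owner of x_j sends it to every other processor
    # that holds a nonzero in column j.
    for j, procs in need_x.items():
        for p in procs:
            if p != x_owner[j]:
                volume += 1
                messages.add((x_owner[j], p, "expand"))

    # Fold phase: every non-owner processor with a partial result for y_i
    # sends that partial result to the owner of y_i.
    for i, procs in make_y.items():
        for p in procs:
            if p != y_owner[i]:
                volume += 1
                messages.add((p, y_owner[i], "fold"))

    return volume, len(messages)


# A 3x3 example with five nonzeros split between two processors.
nonzeros = [(0, 0), (0, 2), (1, 1), (2, 0), (2, 2)]
nz_owner = [0, 1, 1, 0, 1]
stats_a = comm_stats(nonzeros, nz_owner, x_owner=[0, 1, 1], y_owner=[0, 1, 1])
stats_b = comm_stats(nonzeros, nz_owner, x_owner=[1, 1, 1], y_owner=[0, 1, 1])
print(stats_a, stats_b)
```

Volume counts individual words sent, while the message count only grows when a new sender-receiver pair appears in a phase. A partition can therefore have low volume yet many small messages, which is the latency overhead the abstract refers to.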