首页> 外文期刊>Distributed and Parallel Databases >Scaling sparse matrix-matrix multiplication in the accumulo database
【24h】

Scaling sparse matrix-matrix multiplication in the accumulo database

机译:在累积数据库中缩放稀疏矩阵-矩阵乘法

获取原文
获取原文并翻译 | 示例

摘要

We propose and implement a sparse matrix-matrix multiplication (SpGEMM) algorithm running on top of Accumulo's iterator framework which enables high performance distributed parallelism. The proposed algorithm provides write-locality while ingesting the output matrix back to database via utilizing row-by-row parallel SpGEMM. The proposed solution also alleviates scanning of input matrices multiple times by making use of Accumulo's batch scanning capability which is used for accessing multiple ranges of key-value pairs in parallel. Even though the use of batch-scanning introduces some latency overheads, these overheads are alleviated by the proposed solution and by using node-level parallelism structures. We also propose a matrix partitioning scheme which reduces the total communication volume and provides a balance of workload among servers. The results of extensive experiments performed on both real-world and synthetic sparse matrices show that the proposed algorithm scales significantly better than the outer-product parallel SpGEMM algorithm available in the Graphulo library. By applying the proposed matrix partitioning, the performance of the proposed algorithm is further improved considerably.
机译:我们提出并实现了一种在Accumulo迭代器框架之上运行的稀疏矩阵-矩阵乘法(SpGEMM)算法,该算法可实现高性能的分布式并行性。所提出的算法在提供写入局部性的同时,利用逐行并行SpGEMM将输出矩阵提取回数据库。所提出的解决方案还利用Accumulo的批量扫描功能减轻了输入矩阵的多次扫描,该功能用于并行访问多个范围的键-值对。即使使用批处理扫描会带来一些延迟开销,但这些提议的解决方案和使用节点级并行性结构都会减轻这些开销。我们还提出了一种矩阵分区方案,该方案可减少总通信量并提供服务器之间的工作负载平衡。在现实世界和合成稀疏矩阵上进行的大量实验的结果表明,与Graphulo库中可用的外部产品并行SpGEMM算法相比,所提出的算法可扩展性更好。通过应用所提出的矩阵划分,所提出的算法的性能被大大改善。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号