Conference: International Conference on Supercomputing

Run-time compilation for parallel sparse matrix computations

Abstract

Run-time compilation techniques have been shown to be effective for automating the parallelization of loops with unstructured, indirect data access patterns. However, efficiently parallelizing the sparse matrix factorizations commonly used in iterative numerical problems remains an open problem. The difficulty is that a factorization process contains irregularly interleaved communication and computation with varying granularities, which makes it hard to obtain scalable performance on distributed memory machines. In this paper, we present an inspector/executor approach for parallelizing such applications that incorporates automatic graph scheduling techniques to optimize the interleaved communication and computation. We describe a run-time system called RAPID that provides a set of library functions for specifying irregular data objects and the tasks that access these objects. The system extracts a task dependence graph from the data access patterns and executes the tasks efficiently on a distributed memory machine. We discuss a set of optimization strategies used in this system and demonstrate its application to parallelizing sparse Cholesky and LU factorizations.
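The inspector/executor idea in the abstract can be illustrated with a minimal Python sketch. This is not the RAPID API: the function names `build_task_graph` and `execute`, the task tuples, and the column labels `c0` to `c2` are all hypothetical, chosen only for illustration. The inspector derives dependence edges from the data objects each task declares that it reads and writes; the executor then runs tasks in an order that respects those edges. The sketch runs sequentially on one process and omits the scheduling, communication, and distributed-memory aspects that the paper addresses.

```python
from collections import defaultdict, deque

def build_task_graph(tasks):
    """Inspector: derive dependence edges from declared data accesses.

    tasks is a list of (name, reads, writes) in sequential program order.
    Returns a dict mapping each task name to the set of its successors.
    """
    succ = {name: set() for name, _, _ in tasks}
    last_writer = {}             # object -> task that last wrote it
    readers = defaultdict(list)  # object -> tasks that read it since that write
    for name, reads, writes in tasks:
        for obj in reads:
            if obj in last_writer:              # read-after-write dependence
                succ[last_writer[obj]].add(name)
            readers[obj].append(name)
        for obj in writes:
            for r in readers[obj]:              # write-after-read dependence
                if r != name:
                    succ[r].add(name)
            w = last_writer.get(obj)
            if w is not None and w != name:     # write-after-write dependence
                succ[w].add(name)
            last_writer[obj] = name
            readers[obj] = []
    return succ

def execute(tasks, succ, run):
    """Executor: run each task once all of its predecessors have finished."""
    indeg = {name: 0 for name in succ}
    for targets in succ.values():
        for t in targets:
            indeg[t] += 1
    ready = deque(name for name, _, _ in tasks if indeg[name] == 0)
    order = []
    while ready:
        name = ready.popleft()
        run(name)
        order.append(name)
        for t in sorted(succ[name]):
            indeg[t] -= 1
            if indeg[t] == 0:
                ready.append(t)
    return order

# Hypothetical task list shaped like a column-oriented factorization of a
# 3-column matrix: factor a column, then use it to update later columns.
tasks = [
    ("factor0",   ["c0"],       ["c0"]),
    ("update0_1", ["c0", "c1"], ["c1"]),
    ("update0_2", ["c0", "c2"], ["c2"]),
    ("factor1",   ["c1"],       ["c1"]),
    ("update1_2", ["c1", "c2"], ["c2"]),
    ("factor2",   ["c2"],       ["c2"]),
]

succ = build_task_graph(tasks)
order = execute(tasks, succ, run=lambda name: None)  # run is a stub here
print(order)
```

In this toy graph, `update0_2` and `factor1` do not depend on each other, so a parallel executor could run them concurrently. Deciding where to place such tasks and when to send the data they need is the kind of question graph scheduling is meant to answer.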