Run-time compilation techniques have proven effective for automating the parallelization of loops with unstructured, indirect data access patterns. However, efficiently parallelizing the sparse matrix factorizations commonly used in iterative numerical methods remains an open problem. The difficulty is that a factorization process contains irregularly interleaved communication and computation of varying granularities, which makes scalable performance hard to obtain on distributed memory machines. In this paper, we present an inspector/executor approach for parallelizing such applications, incorporating automatic graph-scheduling techniques to optimize interleaved communication and computation. We describe a run-time system called RAPID that provides a set of library functions for specifying irregular data objects and the tasks that access those objects. The system extracts a task dependence graph from data access patterns and executes the tasks efficiently on a distributed memory machine. We discuss a set of optimization strategies used in this system and demonstrate its application in parallelizing sparse Cholesky and LU factorizations.
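To make the inspector/executor idea concrete, the following is a minimal sketch (not RAPID's actual API; all names here are hypothetical): each task declares the data objects it reads and writes, the inspector derives a task dependence graph from those access patterns, and the executor then runs tasks in a dependence-respecting order. A real executor would run independent tasks in parallel across processors; this serial stand-in only illustrates the two phases.

```python
def inspect(tasks):
    """Inspector: build dependence edges from declared access patterns.
    Task j depends on an earlier task i if they touch a common data
    object and at least one of the two accesses is a write."""
    deps = {t["id"]: set() for t in tasks}
    for j, tj in enumerate(tasks):
        for ti in tasks[:j]:
            shared = (tj["reads"] | tj["writes"]) & (ti["reads"] | ti["writes"])
            if any(o in ti["writes"] or o in tj["writes"] for o in shared):
                deps[tj["id"]].add(ti["id"])
    return deps

def execute(tasks, deps):
    """Executor: run each task once all of its predecessors have
    finished (a serial stand-in for the parallel scheduler)."""
    done, order = set(), []
    pending = {t["id"]: t for t in tasks}
    while pending:
        for tid in list(pending):
            if deps[tid] <= done:
                pending[tid]["run"]()
                done.add(tid)
                order.append(tid)
                del pending[tid]
    return order

# Example: T0 produces column c0 of a sparse factor; T1 and T2 both
# read c0 to update their own columns, so both depend on T0 but are
# independent of each other and could run concurrently.
tasks = [
    {"id": "T0", "reads": set(),  "writes": {"c0"}, "run": lambda: None},
    {"id": "T1", "reads": {"c0"}, "writes": {"c1"}, "run": lambda: None},
    {"id": "T2", "reads": {"c0"}, "writes": {"c2"}, "run": lambda: None},
]
deps = inspect(tasks)
order = execute(tasks, deps)
```

Separating inspection from execution pays off when, as in iterative solvers, the same sparsity structure is factorized repeatedly: the dependence graph is built once and the (cheaper) executor phase is rerun many times.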