Workshop on the LLVM Compiler Infrastructure in HPC; International Conference for High Performance Computing, Networking, Storage and Analysis; Workshop on Hierarchical Parallelism for Exascale Computing

Introducing multi-level parallelism, at coarse, fine and instruction level to enhance the performance of iterative solvers for large sparse linear systems on Multi- and Many-core architecture



Abstract

With the evolution of High Performance Computing, multi-core and many-core systems are now a common feature of new hardware architectures. The introduction of very large numbers of cores at the processor level is challenging because it requires handling parallelism at multiple levels, both coarse and fine, to fully exploit the offered computing power. The induced programming effort can be mitigated by parallel programming models based on the data-flow model and the task programming paradigm [1]. To do so, many standard numerical algorithms must be revisited, as they cannot easily be parallelized at the finest levels. Iterative linear solvers are a key part of petroleum reservoir simulation, as they can represent up to 80% of the total computing time. In these algorithms, the standard preconditioning methods for large, sparse, unstructured matrices - such as Incomplete LU factorization (ILU) or Algebraic Multigrid (AMG) - fail to scale on shared-memory architectures with large numbers of cores. In this paper we reconsider preconditioning algorithms to better introduce multi-level parallelism: at the coarse level with MPI, at the fine level with threads, and at the instruction level to enable SIMD optimizations. The paper illustrates how we enhance the implementation of preconditioners such as the multilevel domain decomposition (DDML) preconditioners [2], based on the popular Additive Schwarz Method (ASM), and the classical ILU0 preconditioner with the fine-grained parallel fixed-point variant presented in [3]. Our approach is validated on linear systems extracted from realistic petroleum reservoir simulations. The robustness of the preconditioners is tested with respect to the data heterogeneities of the study cases. We evaluate the extensibility of our implementation with respect to model size, and its scalability on the large core counts provided by new KNL processors and multi-node clusters.
