Performance Analysis of Accelerator Architectures and Programming Models for Parareal Algorithm Solutions of Ordinary Differential Equations

Sumathi Lakshmiranganatha; Suresh S. Muknahallipatna

Abstract

The growing need to study complex dynamical systems requires computing solutions of a large number of time-dependent ordinary and partial differential equations in near real-time. These solutions are typically computed with numerical integration algorithms, which are computationally expensive and inherently sequential, and this makes it difficult to study complex dynamical systems in near real-time. This paper examines the challenges of computing solutions of time-dependent ordinary differential equations using the Parareal algorithm, which belongs to the class of parallel-in-time algorithms, on various accelerator-based high-performance computing architectures and their associated programming models. The paper presents the code refactoring steps and a performance analysis of the Parareal algorithm on two accelerator computing architectures, the Intel Xeon Phi CPU and Graphics Processing Unit many-core architectures, with the OpenMP, OpenACC, and CUDA programming models. Speedup and scaling performance analyses are used to demonstrate the suitability of the Parareal algorithm for computing the solutions of a single time-dependent ordinary differential equation and of a family of interdependent time-dependent ordinary differential equations. The speedup, weak-scaling, and strong-scaling results show that Graphics Processing Units with the CUDA programming model are the most efficient accelerator for computing solutions of time-dependent ordinary differential equations using parallel-in-time algorithms. However, when the time and effort required to refactor the code for execution on the accelerator architectures are taken into account, Graphics Processing Units with the OpenACC programming model are the most efficient choice for computing solutions of time-dependent ordinary differential equations using parallel-in-time algorithms.
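
As a rough illustration of the Parareal iteration named in the abstract, the following minimal Python sketch splits the time interval into slices, runs an accurate "fine" propagator on every slice independently (the part that could be mapped to accelerator threads), and corrects the result with a cheap sequential "coarse" propagator. It is not the authors' implementation: the function names (propagate, parareal), the use of forward Euler for both propagators, and the step counts are all assumptions made for illustration only.

```python
import math


def propagate(f, t0, y0, t1, n_steps):
    """Advance y' = f(t, y) from t0 to t1 with n_steps forward-Euler steps.

    Forward Euler is an assumption for this sketch; any one-step integrator
    could be substituted.
    """
    dt = (t1 - t0) / n_steps
    t, y = t0, y0
    for _ in range(n_steps):
        y = y + dt * f(t, y)
        t += dt
    return y


def parareal(f, t0, t_end, y0, n_slices, n_iters, coarse_steps=1, fine_steps=100):
    """Parareal sketch (hypothetical helper, not taken from the paper).

    Returns the slice boundaries ts and the approximate solution U at them.
    """
    ts = [t0 + i * (t_end - t0) / n_slices for i in range(n_slices + 1)]

    # Initial guess: one sequential sweep of the cheap coarse propagator.
    U = [y0]
    for n in range(n_slices):
        U.append(propagate(f, ts[n], U[n], ts[n + 1], coarse_steps))

    for _ in range(n_iters):
        # Fine solves are independent per slice: this loop is the parallel part.
        F = [propagate(f, ts[n], U[n], ts[n + 1], fine_steps)
             for n in range(n_slices)]
        # Coarse solves from the previous iterate, reused in the correction.
        G_old = [propagate(f, ts[n], U[n], ts[n + 1], coarse_steps)
                 for n in range(n_slices)]

        # Sequential correction sweep: U_new[n+1] = G(U_new[n]) + F[n] - G_old[n].
        U_new = [y0]
        for n in range(n_slices):
            G_new = propagate(f, ts[n], U_new[n], ts[n + 1], coarse_steps)
            U_new.append(G_new + F[n] - G_old[n])
        U = U_new

    return ts, U


if __name__ == "__main__":
    # Test problem y' = -y, y(0) = 1, whose exact solution is exp(-t).
    ts, U = parareal(lambda t, y: -y, 0.0, 2.0, 1.0, n_slices=10, n_iters=3)
    print("Parareal:", U[-1], "exact:", math.exp(-2.0))
```

In this sketch only the list comprehension that builds F is parallel across time slices; the coarse sweeps remain sequential, which is why the coarse propagator has to be cheap for the method to pay off. After as many iterations as there are slices, the result matches the sequential fine solution up to rounding, so any practical speedup depends on converging in far fewer iterations.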
