Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors

MARTIN KRONBICHLER; KARL LJUNGKVIST

Abstract

This article presents matrix-free finite-element techniques for efficiently solving partial differential equations on modern many-core processors, such as graphics cards. We develop a GPU parallelization of a matrix-free geometric multigrid iterative solver targeting moderate and high polynomial degrees, with support for general curved and adaptively refined hexahedral meshes with hanging nodes. The central algorithmic component is the matrix-free operator evaluation with sum factorization. We compare the node-level performance of our implementation running on an Nvidia Pascal P100 GPU to a highly optimized multicore implementation running on comparable Intel Broadwell CPUs and an Intel Xeon Phi. Our experiments show that the GPU implementation is approximately 1.5 to 2 times faster across four different scenarios of the Poisson equation and a variety of element degrees in 2D and 3D. The lowest time to solution per degree of freedom is recorded for moderate polynomial degrees between 3 and 5. A detailed performance analysis highlights the capabilities of the GPU architecture and the chosen execution model with threading within the element, particularly with respect to the evaluation of the matrix-vector product. Atomic intrinsics are shown to provide a fast way for avoiding the possible race conditions in summing the elemental residuals into the global vector associated to shared vertices, edges, and surfaces. In addition, the solver infrastructure allows for using mixed-precision arithmetic that performs the multigrid V-cycle in single precision with an outer correction in double precision, increasing throughput by up to 83%.
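
The abstract names sum factorization as the central component of the matrix-free operator evaluation. The following is a minimal sketch of that idea in plain Python on a single 2D tensor-product element. It is not the authors' GPU implementation. The function names, the equidistant Lagrange nodes, and the midpoint evaluation points are all assumptions chosen for illustration. It compares a direct evaluation of the tensor-product sum, which costs O(n^4) operations per element, with two sweeps of 1D kernels, which cost O(n^3).

```python
# Sketch of sum factorization on one 2D tensor-product element.
# All names below are hypothetical and chosen for illustration only.

def lagrange_basis(nodes, i, x):
    """Value at x of the i-th 1D Lagrange polynomial on the given nodes."""
    value = 1.0
    for k, xk in enumerate(nodes):
        if k != i:
            value *= (x - xk) / (nodes[i] - xk)
    return value


def shape_matrix(nodes, points):
    """S[q][i] = phi_i(points[q]): 1D basis functions tabulated at 1D points."""
    return [[lagrange_basis(nodes, i, x) for i in range(len(nodes))]
            for x in points]


def interpolate_naive(S, u):
    """Evaluate sum_{i,j} S[q1][i] * S[q2][j] * u[i][j] directly (O(n^4))."""
    nq, n = len(S), len(S[0])
    return [[sum(S[q1][i] * S[q2][j] * u[i][j]
                 for i in range(n) for j in range(n))
             for q2 in range(nq)]
            for q1 in range(nq)]


def interpolate_sum_factorized(S, u):
    """Same result as two sweeps of 1D kernels (O(n^3))."""
    nq, n = len(S), len(S[0])
    # Sweep 1: contract the x index i, keeping the y index j.
    tmp = [[sum(S[q1][i] * u[i][j] for i in range(n)) for j in range(n)]
           for q1 in range(nq)]
    # Sweep 2: contract the y index j.
    return [[sum(S[q2][j] * tmp[q1][j] for j in range(n))
             for q2 in range(nq)]
            for q1 in range(nq)]


def f(x, y):
    """Test function; a degree-3 tensor-product basis represents it exactly."""
    return x * x * y + 1.0


degree = 3
nodes = [k / degree for k in range(degree + 1)]                  # equidistant nodes on [0, 1]
points = [(q + 0.5) / (degree + 1) for q in range(degree + 1)]   # evaluation points

S = shape_matrix(nodes, points)
u = [[f(xi, yj) for yj in nodes] for xi in nodes]                # nodal coefficients

naive = interpolate_naive(S, u)
fast = interpolate_sum_factorized(S, u)

n_pts = len(points)
# Largest difference between the two evaluation strategies.
max_diff = max(abs(naive[a][b] - fast[a][b])
               for a in range(n_pts) for b in range(n_pts))
# Largest deviation of the sum-factorized result from the exact function values.
max_err = max(abs(fast[a][b] - f(points[a], points[b]))
              for a in range(n_pts) for b in range(n_pts))
```

Both strategies agree to rounding error. In 3D the same pattern uses three sweeps, and the saving relative to the direct sum grows with the polynomial degree, which is why the technique pays off for moderate and high degrees.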

Bibliographic details

Source: ACM Transactions on Parallel Computing, 2019, No. 1, pp. 3-34 (32 pages)
Authors: MARTIN KRONBICHLER; KARL LJUNGKVIST
Affiliations: Technical University of Munich; Uppsala University
Original format: PDF
Language: English
Keywords: finite element method; sum factorization; matrix-free method; geometric multigrid; CUDA

Similar literature

1. Multigrid for matrix-free high-order finite element computations on graphics processors [J]. Amos Olagunju. Computing Reviews, 2021, No. 7.
2. Multigrid for Matrix-Free High-Order Finite Element Computations on Graphics Processors [J]. Martin Kronbichler, Karl Ljungkvist. ACM Transactions on Parallel Computing, 2019, No. 1.
3. A scalable, matrix-free multigrid preconditioner for finite element discretizations of heterogeneous Stokes flow [J]. May D. A., Brown J., Le Pourhiet L. Computer Methods in Applied Mechanics and Engineering, 2015, Jun. 15 issue.
4. Matrix-Free Finite-Element Computations on Graphics Processors with Adaptively Refined Unstructured Meshes [C]. Karl Ljungkvist. Simulation Multi-Conference, 2017.
5. Investigation of general-purpose computing on graphics processing units and its application to the finite element analysis of electromagnetic problems [D]. Meng, Huan-Ting. 2015.
6. Three-dimensional computational model simulating the fracture healing process with both biphasic poroelastic finite element analysis and fuzzy logic control [O]. Monan Wang, Ning Yang.
7. Efficient Finite Element Geometric Multigrid Solvers for Unstructured Grids on Graphics Processing Units [O]. M. Geveler, D. Ribbrock, D. Göddeke.
