A parallel log barrier-based mesh warping algorithm for distributed memory machines

Thap Panitanarak; Suzanne M. Shontz

Abstract

Parallel dynamic meshes are essential for computational simulations of large-scale scientific applications involving motion. To address this need, we propose parallel LBWARP, a parallel log barrier-based tetrahedral mesh warping algorithm for distributed memory machines. Our algorithm is a general-purpose, geometric mesh warping algorithm that parallelizes the sequential LBWARP algorithm proposed by Shontz and Vavasis. The first step of the algorithm involves computation of a set of local weights for each interior node which describe the relative distances of the node to each of its neighbors. The weight computation step is the most time consuming in the parallel algorithm. Based on our choice of the mesh partition and the corresponding distribution of data and assignment of tasks to processors, communication among processors is avoided in an embarrassingly parallel computation of the weights. Once this representation of the initial mesh is determined, a target deformation of the boundary is applied, also in an embarrassingly parallel manner. Finally, new coordinates of the interior nodes are obtained by solving a system of linear equations with multiple right-hand sides that is based on the weights and boundary deformation. This linear system can be solved using one of three parallel sparse linear solvers, i.e., the distributed block BiCG, block GMRES, or LU algorithm, all of which support the solution of linear systems with multiple right-hand side vectors. Our numerical results demonstrate good efficiency and strong scalability of parallel LBWARP on up to 64 processors, as the experiments show close to linear speedup in all cases. Weak scalability is also demonstrated. The performance of the parallel sparse linear solvers is dependent on factors such as the mesh size, the amount of available memory, and the number of processors. 
For example, the distributed LU algorithm gives better performance on small meshes, whereas the distributed block BiCG and distributed block GMRES algorithms yield better performance when the amount of available memory is limited. Finally, we demonstrate the performance of parallel LBWARP on a sequence of mesh deformations, for which the runtime of the overall algorithm can be significantly reduced. When applied to k deformations with the distributed LU linear solver, parallel LBWARP reuses the weight matrix that was computed during the first deformation. This gives close to k-time performance for sufficiently many deformations.
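
The final step described in the abstract (solving one linear system with multiple right-hand sides, and reusing work across k deformations) can be illustrated with a small serial sketch. This is not the authors' implementation. It assumes that each interior node is written as a weighted average of its neighbors with positive weights summing to one, so the interior coordinates X_I satisfy (I - W_II) X_I = W_IB X_B. It also assumes that the reuse is realized by factoring the matrix once and reusing the factors, whereas the abstract states only that the weight matrix is reused. The mesh, the node names, the hand-picked weights (stand-ins for the output of the log-barrier weight step), and the helpers `build_system`, `lu_factor`, `lu_solve`, and `matmul` are all hypothetical. The example is 2D and dense for brevity; the paper targets tetrahedral meshes with parallel sparse solvers.

```python
def build_system(weights, interior, boundary):
    """Return (A, W_B): A = I - W_II, W_B = interior-to-boundary weights."""
    ii = {node: r for r, node in enumerate(interior)}
    bi = {node: c for c, node in enumerate(boundary)}
    n = len(interior)
    A = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    W_B = [[0.0] * len(boundary) for _ in range(n)]
    for node, nbrs in weights.items():
        for nbr, w in nbrs.items():
            if nbr in ii:
                A[ii[node]][ii[nbr]] -= w
            else:
                W_B[ii[node]][bi[nbr]] = w
    return A, W_B


def lu_factor(A):
    """Dense LU with partial pivoting; returns packed factors and row permutation."""
    n = len(A)
    LU = [row[:] for row in A]
    perm = list(range(n))
    for k in range(n):
        pivot = max(range(k, n), key=lambda r: abs(LU[r][k]))
        if LU[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        LU[k], LU[pivot] = LU[pivot], LU[k]
        perm[k], perm[pivot] = perm[pivot], perm[k]
        for r in range(k + 1, n):
            LU[r][k] /= LU[k][k]
            for c in range(k + 1, n):
                LU[r][c] -= LU[r][k] * LU[k][c]
    return LU, perm


def lu_solve(LU, perm, B):
    """Solve A X = B for every column of B using existing factors."""
    n, m = len(LU), len(B[0])
    X = [B[perm[i]][:] for i in range(n)]      # apply the row permutation
    for i in range(n):                         # forward substitution, unit lower factor
        for j in range(i):
            for c in range(m):
                X[i][c] -= LU[i][j] * X[j][c]
    for i in reversed(range(n)):               # back substitution, upper factor
        for j in range(i + 1, n):
            for c in range(m):
                X[i][c] -= LU[i][j] * X[j][c]
        for c in range(m):
            X[i][c] /= LU[i][i]
    return X


def matmul(M, X):
    """Plain dense matrix product M @ X."""
    return [[sum(M[i][k] * X[k][c] for k in range(len(X)))
             for c in range(len(X[0]))] for i in range(len(M))]


# Hypothetical mesh: rectangle corners b0..b3, interior nodes p=(1,1), q=(3,1).
interior = ["p", "q"]
boundary = ["b0", "b1", "b2", "b3"]
boundary_xy = [[0.0, 0.0], [4.0, 0.0], [4.0, 2.0], [0.0, 2.0]]

# Hand-picked weights (assumption): positive, each row sums to 1, and each
# interior node equals the weighted average of its neighbors.
weights = {
    "p": {"b0": 0.3, "b3": 0.4, "b1": 0.1, "q": 0.2},
    "q": {"b1": 0.4, "b2": 0.3, "b3": 0.1, "p": 0.2},
}

A, W_B = build_system(weights, interior, boundary)
LU, perm = lu_factor(A)                        # factor once

# Three boundary deformations; each right-hand side has one column per coordinate.
translated = [[x + 1.0, y - 2.0] for x, y in boundary_xy]
stretched = [[2.0 * x, y] for x, y in boundary_xy]
deformations = [boundary_xy, translated, stretched]

# Every deformation reuses the same factors; only the right-hand side changes.
warped = [lu_solve(LU, perm, matmul(W_B, B)) for B in deformations]

for X in warped:
    print(X)  # the first entry is approximately p=(1, 1), q=(3, 1)
```

In this sketch the per-deformation cost is one small matrix product plus two triangular solves, which is the kind of saving the reuse across k deformations is meant to capture. Because the weights sum to one and reproduce the original interior positions, the identity, translated, and stretched boundaries yield the identity, translated, and stretched interior nodes, respectively.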

Bibliographic details

Source
Engineering with Computers | 2018, Issue 1 | pp. 59-76 | 18 pages
Authors
Thap Panitanarak; Suzanne M. Shontz;
Author affiliations

Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, USA;

Department of Electrical Engineering and Computer Science, Bioengineering Graduate Program, Information and Telecommunication Technology Center, University of Kansas, Lawrence, KS 66045, USA;

Original format: PDF
Language: English
Keywords
Parallel computing; Mesh warping; Mesh deformation; Log-barrier method; Tetrahedral meshes; Sparse linear solvers; Multiple right-hand side problem;


