Conference: International Meshing Roundtable

A Hybrid Parallel Delaunay Image-to-Mesh Conversion Algorithm Scalable on Distributed-Memory Clusters



Abstract

In this paper, we present a scalable three-dimensional hybrid MPI+Threads parallel Delaunay image-to-mesh conversion algorithm. A nested master-worker communication model for parallel mesh generation is implemented that simultaneously exploits process-level and thread-level parallelization: inter-node communication uses MPI, and inter-core communication within a node uses threads. To overlap communication (task requests and data movement) with computation (parallel mesh refinement), the inter-node MPI communication and the intra-node local mesh refinement are separated. The master thread that initializes the MPI environment is in charge of the inter-node MPI communication, while the worker threads of each process are responsible only for local mesh refinement within the node. We conducted a set of experiments to test the performance of the algorithm on Turing, a distributed-memory cluster at the Old Dominion University High Performance Computing Center, and observed that the granularity of the coarse-level data decomposition, which affects coarse-level concurrency, has a significant influence on the performance of the algorithm. With a proper value of granularity, the algorithm shows strong performance potential and scales to 30 distributed-memory compute nodes with 20 cores each (the maximum number of nodes available to us in the experiments).
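The division of labor described in the abstract (one master thread per process handling communication, worker threads doing only local refinement) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes an in-process queue as a stand-in for inter-node MPI traffic, and `run_node`, `refine`, and the `(region_id, cells)` task format are hypothetical names introduced only for illustration.

```python
import queue
import threading


def refine(task):
    # Hypothetical placeholder for local mesh refinement of one subregion.
    # It pretends each cell yields 6 elements; a real refiner would insert
    # Delaunay points until quality and fidelity criteria are met.
    region_id, cells = task
    return (region_id, cells * 6)


def run_node(tasks, num_workers=4):
    """Simulate one process: a master thread feeds work, workers refine it."""
    # Bounded buffer so the master can fetch ahead while workers compute.
    work = queue.Queue(maxsize=2 * num_workers)
    results = []
    results_lock = threading.Lock()
    stop = object()  # sentinel telling a worker to exit

    def master():
        # Assumption: stands in for the thread that would own all MPI calls
        # (task requests and data movement) in an MPI+threads code.
        for task in tasks:
            work.put(task)
        for _ in range(num_workers):
            work.put(stop)

    def worker():
        # Workers never communicate off-node; they only refine local tasks.
        while True:
            task = work.get()
            if task is stop:
                break
            refined = refine(task)
            with results_lock:
                results.append(refined)

    threads = [threading.Thread(target=master)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Three coarse subregions; the task count plays the role of granularity.
    print(sorted(run_node([(0, 10), (1, 20), (2, 5)], num_workers=2)))
    # prints [(0, 60), (1, 120), (2, 30)]
```

In this sketch the number of entries in `tasks` plays the role of the coarse-level decomposition granularity: with too few tasks some workers sit idle, while many small tasks raise the master's communication load.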
