Source: Proceedings of 9th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems

A General-Purpose Hierarchical Mesh Partitioning Method with Node Balancing Strategies for Large-Scale Numerical Simulations



Abstract

Large-scale parallel numerical simulations are essential for a wide range of engineering problems that involve complex, coupled physical processes interacting across a broad range of spatial and temporal scales. The data structures involved in such simulations (meshes, sparse matrices, etc.) are frequently represented as graphs, and these graphs must be optimally partitioned across the available computational resources in order for the underlying calculations to scale efficiently. Desirable partitions minimize the number of graph edges that are cut (edge-cuts) while simultaneously balancing the amount of work (i.e., graph nodes) assigned to each processor core. By this metric, the performance of most existing partitioning software begins to degrade for partitions with more than $O(10^3)$ processor cores. In this work, we consider a general-purpose hierarchical partitioner that takes into account the existence of multiple processor cores and shared memory in a compute node while partitioning a graph into an arbitrary number of subgraphs. We demonstrate that our algorithms significantly improve the preconditioning efficiency and overall performance of realistic numerical simulations running on up to 32,768 processor cores with nearly $10^9$ unknowns.
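
The two quality measures named in the abstract, edge-cut and node balance, can be made concrete on a toy graph. The following is a minimal sketch and not the authors' partitioner: the function names, the 2 x 4 grid, and the assumption that core `c` lives on compute node `c // cores_per_node` are all illustrative choices introduced here. The last function separates cut edges that cross compute nodes from those that stay within one node, which is the kind of distinction a hierarchical partitioner can exploit when cores on a node share memory.

```python
from collections import Counter


def edge_cut(edges, part):
    """Count edges whose two endpoints are assigned to different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


def imbalance(part, num_parts):
    """Largest part size divided by the ideal (average) part size.

    A value of 1.0 means every part holds exactly the same number of vertices.
    """
    sizes = Counter(part.values())
    ideal = len(part) / num_parts
    return max(sizes.values()) / ideal


def inter_node_cut(edges, part, cores_per_node):
    """Count cut edges whose endpoints sit on different compute nodes.

    Assumption (hypothetical layout): core c belongs to compute node
    c // cores_per_node.
    """
    return sum(
        1
        for u, v in edges
        if part[u] // cores_per_node != part[v] // cores_per_node
    )


# Toy 2 x 4 grid graph, vertices laid out as
#   0 1 2 3
#   4 5 6 7
edges = [
    (0, 1), (1, 2), (2, 3),          # top row
    (4, 5), (5, 6), (6, 7),          # bottom row
    (0, 4), (1, 5), (2, 6), (3, 7),  # vertical links
]

# One grid column per core: vertex -> core id.
part = {0: 0, 4: 0, 1: 1, 5: 1, 2: 2, 6: 2, 3: 3, 7: 3}

print(edge_cut(edges, part))           # 6 horizontal edges are cut
print(imbalance(part, 4))              # 1.0, two vertices per core
print(inter_node_cut(edges, part, 2))  # 2 cut edges cross compute nodes
```

In this example only two of the six cut edges cross a compute-node boundary; the other four connect cores that would share memory under the assumed layout.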
