
Efficient Parallelization of Non-uniform Fast Multipole Algorithms


Abstract

Many applications of the N-body problem today involve distributions of bodies that are (i) very large and (ii) highly non-uniform. A variety of fast multipole algorithms have been devised to reduce the cost from O(N^2) to O(N log N) or O(N) for oscillatory and non-oscillatory problems, respectively. The issue of non-uniformity, however, presents significant challenges in parallelization, requiring a much more nuanced approach. Compounding this challenge, oscillatory N-body problems arising from wave physics (electromagnetics, acoustics, etc.) are burdened with capturing both phase and amplitude information as opposed to just the amplitude; non-uniformity even further complicates things. As a result, the algorithm and underlying data structures become extremely complicated, and parallelization becomes quite difficult.

This thesis aims to develop novel parallel fast multipole methods for both oscillatory and non-oscillatory problems that (i) are controllably accurate to arbitrary precision, (ii) are capable of efficiently handling highly non-uniform distributions, and (iii) scale well up to extremely large problem sizes and numbers of CPU cores. The accelerated Cartesian expansion (ACE) method and wideband multilevel fast multipole algorithm (MLFMA) are modified to accurately and efficiently accommodate non-uniform, and in the case of MLFMA extremely large, distributions in parallel. Several parallel algorithms for efficiently building the distributed non-uniform tree data structures are developed. Effective, novel algorithms are introduced to reduce load imbalances arising from non-uniformity and certain idiosyncrasies of the parallel wideband MLFMA which hamper scalability. The algorithms presented here meet each of the stated goals, enabling computations involving several hundred million degrees of freedom on 2048 cores for an electromagnetics problem and several billion particles on 16,384 cores for non-oscillatory problems.

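Bibliographic record

  • Author

    Hughey, Stephen Michael

  • Author affiliation

    Michigan State University

  • Degree-granting institution: Michigan State University
  • Subjects: Computational physics; Electromagnetics; Electrical engineering
  • Degree: Ph.D.
  • Year: 2018
  • Pages: 105 p.
  • Total pages: 105
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification:
  • Keywords:

  • Date added to database: 2022-08-17 11:53:10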
