Parallel algorithms for Monte Carlo particle transport simulation on exascale computing architectures

Abstract

Monte Carlo particle transport methods are being considered as a viable option for high-fidelity simulation of nuclear reactors. While Monte Carlo methods offer several potential advantages over deterministic methods, there are a number of algorithmic shortcomings that would prevent their immediate adoption for full-core analyses. In this thesis, algorithms are proposed both to ameliorate the degradation in parallel efficiency typically observed for large numbers of processors and to offer a means of decomposing large tally data that will be needed for reactor analysis. A nearest-neighbor fission bank algorithm was proposed and subsequently implemented in the OpenMC Monte Carlo code. A theoretical analysis of the communication pattern shows that the expected cost is O(√N), whereas traditional fission bank algorithms are O(N) at best. The algorithm was tested on two supercomputers, the Intrepid Blue Gene/P and the Titan Cray XK7, and demonstrated nearly linear parallel scaling up to 163,840 processor cores on a full-core benchmark problem. An algorithm for reducing network communication arising from tally reduction was analyzed and implemented in OpenMC. The proposed algorithm groups only particle histories on a single processor into batches for tally purposes; in doing so, it prevents all network communication for tallies until the very end of the simulation. The algorithm was tested, again on a full-core benchmark, and shown to reduce network communication substantially. A model was developed to predict the impact of load imbalances on the performance of domain-decomposed simulations. The analysis demonstrated that load imbalances in domain-decomposed simulations arise from two distinct phenomena: non-uniform particle densities and non-uniform spatial leakage. The dominant performance penalty for domain decomposition was shown to come from these physical effects rather than from insufficient network bandwidth or high latency. The model predictions were verified with measured data from simulations in OpenMC on a full-core benchmark problem. Finally, a novel algorithm for decomposing large tally data was proposed, analyzed, implemented, and tested in OpenMC. The algorithm relies on disjoint sets of compute processes and tally servers. The analysis showed that for a range of parameters relevant to LWR analysis, the tally server algorithm should perform with minimal overhead. Tests were performed on Intrepid and Titan and demonstrated that the algorithm did indeed perform well over a wide range of parameters.
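The tally batching idea in the abstract (each processor forms batches only from its own particle histories, so tallies need no network communication until the end of the run) can be illustrated with a minimal Python sketch. This is not OpenMC's implementation. The names `simulate_rank` and `combine` are hypothetical, the processors are simulated by an ordinary loop rather than MPI, and a random exponential score stands in for real particle transport.

```python
import math
import random


def simulate_rank(seed, n_batches, histories_per_batch):
    """Accumulate tally statistics from this rank's own histories only.

    Returns (number of batches, sum of batch means, sum of squared batch means).
    No data from other ranks is needed while the simulation runs.
    """
    rng = random.Random(seed)
    count, total, total_sq = 0, 0.0, 0.0
    for _ in range(n_batches):
        # Assumption: a random score stands in for tracking one particle history.
        scores = [rng.expovariate(1.0) for _ in range(histories_per_batch)]
        batch_mean = sum(scores) / histories_per_batch
        count += 1
        total += batch_mean
        total_sq += batch_mean * batch_mean
    return count, total, total_sq


def combine(partials):
    """One end-of-run reduction over the per-rank (count, sum, sum of squares)."""
    n = sum(p[0] for p in partials)
    s = sum(p[1] for p in partials)
    s2 = sum(p[2] for p in partials)
    mean = s / n
    # Variance of the mean from the pooled batch realizations.
    var_of_mean = (s2 / n - mean * mean) / (n - 1)
    return mean, math.sqrt(max(var_of_mean, 0.0))


# Eight simulated ranks, each running independently until the final reduction.
partials = [simulate_rank(seed=rank, n_batches=20, histories_per_batch=500)
            for rank in range(8)]
mean, std_err = combine(partials)
print(f"tally mean = {mean:.4f}, standard error = {std_err:.4f}")
```

In this sketch each rank sends just three numbers per tally bin, once, where a scheme that reduces after every batch would communicate after each of the 20 batches. In an MPI code the `combine` step would correspond to a single sum reduction at the end of the simulation.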
