The Development of WARP - A Framework for Continuous Energy Monte Carlo Neutron Transport in General 3D Geometries on GPUs.

Abstract

Graphics processing units (GPUs) have steadily grown in computational power, from the small, job-specific boards of the early 1990s to the programmable powerhouses of today. Compared to the more common central processing units (CPUs), GPUs have higher aggregate memory bandwidth, much higher floating-point throughput (FLOPS), and lower energy consumption per floating-point operation. Because power consumption is one of the main obstacles in exascale computing, many new supercomputing platforms gain much of their computational capacity by incorporating GPUs into their compute nodes. CPU-optimized parallel algorithms are not directly portable to GPU architectures, at least not without a substantial loss of performance, so transport codes need to be rewritten to execute efficiently on GPUs. Unless this is done, reactor simulations cannot take full advantage of these new supercomputers.

WARP, which can stand for "Weaving All the Random Particles," is a three-dimensional (3D) continuous energy Monte Carlo neutron transport code developed in this work to implement a continuous energy Monte Carlo neutron transport algorithm efficiently on a GPU. WARP accelerates Monte Carlo simulations while preserving the main benefit of the Monte Carlo method, namely that very few physical and geometrical simplifications are needed. WARP can calculate multiplication factors, flux tallies, and fission source distributions for time-independent problems, and can run in either criticality or fixed-source mode. WARP can transport neutrons in unrestricted arrangements of parallelepipeds, hexagonal prisms, cylinders, and spheres.

WARP uses an event-based algorithm, but with some important differences. Moving data is expensive, so WARP uses a remapping vector of pointer/index pairs to direct GPU threads to the data they need to access. The remapping vector is sorted by reaction type after every transport iteration using a high-efficiency parallel radix sort, which keeps the reaction types as contiguous as possible and removes completed histories from the transport cycle. The sort reduces divergence within GPU "thread blocks," keeps the SIMD units as full as possible, and avoids spending memory bandwidth on checking whether a neutron in the batch has been terminated. Using a remapping vector makes the data access pattern irregular, but this is mitigated by using large batch sizes, with which the GPU can effectively eliminate the high cost of irregular global memory access.

WARP modifies the standard unionized energy grid implementation to reduce memory traffic. Instead of storing a single matrix of pointers indexed by reaction type and energy, WARP stores three matrices. The first contains cross section values, the second contains pointers to angular distributions, and the third contains pointers to energy distributions. This linked-list-like layout increases memory usage, but it lowers the number of data loads needed to determine a reaction by eliminating the pointer load otherwise required to find a cross section value.

WARP also uses optimized, high-performance GPU code libraries wherever possible. The CUDA Data Parallel Primitives (CUDPP) library is used to perform the parallel reductions, sorts, and sums; the CURAND library is used to seed the linear congruential random number generators; and the OptiX ray tracing framework is used for geometry representation. OptiX is a highly optimized library developed by NVIDIA that automatically builds hierarchical acceleration structures around user-input geometry, so that only surfaces along a ray need to be queried during ray tracing. WARP also performs material and cell number queries with OptiX by using a point-in-polygon-like algorithm.

WARP has shown that GPUs are an effective platform for performing Monte Carlo neutron transport with continuous energy cross sections. Currently, WARP is the most detailed and feature-rich program in existence for performing continuous energy Monte Carlo neutron transport in general 3D geometries on GPUs, but compared to production codes like Serpent and MCNP, WARP has limited capabilities. Despite this lack of features, its novel algorithm implementations show that high performance can be achieved on a GPU in spite of the inherently divergent program flow and sparse data access patterns. WARP is not ready for everyday nuclear reactor calculations, but it is a good platform for further development of GPU-accelerated Monte Carlo neutron transport. In its current state, it may be a useful tool for multiplication factor searches, i.e., determining reactivity coefficients by perturbing material densities or temperatures, since these types of calculations typically do not require many flux tallies. (Abstract shortened by UMI.)
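
The remapping step described in the abstract can be illustrated with a small serial sketch. It is not taken from WARP: the reaction codes, the function name `sort_remap`, and the use of a single counting-sort pass (one digit of a radix sort) on the CPU are all assumptions made for illustration, whereas the abstract describes a parallel radix sort on the GPU. The sketch only shows the idea that the neutron data stays in place while a vector of indices is reordered by reaction type and finished histories fall off the end.

```python
# Hypothetical reaction codes, chosen so that finished histories sort last.
SCATTER, FISSION, CAPTURE, DONE = 0, 1, 2, 3


def sort_remap(remap, reaction):
    """Reorder the remap vector by reaction code and drop finished histories.

    remap    -- list of indices into the neutron data arrays (data is not moved)
    reaction -- reaction[i] is the reaction code sampled for neutron i
    Returns (live_remap, ranges), where ranges[r] is the [start, end) slice of
    live_remap holding the neutrons that undergo reaction r.
    """
    # Histogram of reaction codes among the neutrons still in the remap vector.
    counts = [0] * (DONE + 1)
    for i in remap:
        counts[reaction[i]] += 1

    # Exclusive prefix sum gives the first output slot for each reaction code.
    starts = [0] * (DONE + 1)
    for r in range(1, DONE + 1):
        starts[r] = starts[r - 1] + counts[r - 1]

    # Stable scatter: neutrons with the same reaction end up contiguous.
    out = [0] * len(remap)
    cursor = list(starts)
    for i in remap:
        r = reaction[i]
        out[cursor[r]] = i
        cursor[r] += 1

    # Everything from starts[DONE] onward is a finished history; truncate it.
    n_live = starts[DONE]
    ranges = {r: (starts[r], starts[r] + counts[r]) for r in range(DONE)}
    return out[:n_live], ranges
```

For six neutrons with reactions `[FISSION, SCATTER, DONE, SCATTER, CAPTURE, DONE]` and an initial remap vector `[0, 1, 2, 3, 4, 5]`, the function returns the live vector `[1, 3, 0, 4]`, so the two scattering neutrons occupy adjacent slots and the two finished histories are no longer visited. In a GPU setting, each contiguous slice would be handed to threads that all execute the same reaction routine.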

Bibliographic record

  • Author

    Bergmann, Ryan

  • Author affiliation

    University of California, Berkeley

  • Degree-granting institution: University of California, Berkeley
  • Subject: Engineering, Nuclear; Computer Science
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 165
  • Original format: PDF
  • Language: English