Conference: International Topical Meeting on Nuclear Reactor Thermal Hydraulics

DEVELOPING A MINI-APP FOR EXPLORING ALGORITHMS FOR UNSTRUCTURED MESH DETERMINISTIC DISCRETE ORDINATES TRANSPORT ON MANY-CORE ARCHITECTURES

Abstract

Recent trends in computational architecture design are yielding processors with deep and complex memory hierarchies consisting of small-capacity caches and large-capacity main memory. CPU parallelism is also hierarchical, consisting of SIMD vector units contained within multiple computational cores, with one or more packages in a multi-socket system. Solving the deterministic discrete ordinates transport equation effectively on these architectures requires extracting concurrent work and mapping it effectively to the processing elements to achieve performance close to the maximum attainable. This challenge becomes more acute when an unstructured spatial domain is required, where the sweep dependency between neighbouring spatial cells/elements is not implicit as it is for a structured grid. In this paper we introduce the transport community to the UnSNAP mini-app, a port of the well-known SNAP proxy application. UnSNAP was developed to investigate the performance of arbitrarily high-order discontinuous Galerkin finite element unstructured deterministic transport codes on advanced architectures. Approaches to local matrix assembly and solution are evaluated to assess their performance for different element orders, and the trade-offs with respect to the performance and memory capacity limits of advanced architectures are discussed. The performance-limiting factors will be explored on many-core architectures, including CPUs from Intel, AMD and Marvell (Arm). We will also discuss performing unstructured sweeps on GPU devices, highlighting the associated challenges.
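
The sweep dependency mentioned in the abstract can be illustrated with a small sketch. This is not UnSNAP's implementation; it is a minimal, generic example that assumes a hypothetical mesh description (a list of interior faces, each given as two cell indices and a normal pointing from the first cell to the second). For one ordinate direction `omega`, a cell depends on every neighbour whose shared face is upwind of it, and cells with no outstanding upwind neighbours can be grouped into a wavefront of concurrent work. The function name `sweep_schedule` and the data layout are assumptions made for illustration.

```python
from collections import defaultdict


def sweep_schedule(num_cells, faces, omega):
    """Group cells into wavefronts for one discrete ordinate direction.

    faces: list of (cell_a, cell_b, normal) for interior faces, where
           normal points from cell_a to cell_b (hypothetical layout).
    omega: direction vector of the ordinate.
    Returns a list of lists; cells within one inner list have no mutual
    dependency and could be processed concurrently.
    """
    downwind = defaultdict(list)      # cell -> cells that depend on it
    upwind_count = [0] * num_cells    # number of unresolved upwind neighbours

    for a, b, normal in faces:
        flow = sum(o * n for o, n in zip(omega, normal))
        if flow > 0:                  # particles travel from a into b
            downwind[a].append(b)
            upwind_count[b] += 1
        elif flow < 0:                # particles travel from b into a
            downwind[b].append(a)
            upwind_count[a] += 1
        # flow == 0: face is parallel to omega, so no dependency

    ready = [c for c in range(num_cells) if upwind_count[c] == 0]
    schedule = []
    scheduled = 0
    while ready:
        schedule.append(ready)
        scheduled += len(ready)
        next_ready = []
        for c in ready:
            for d in downwind[c]:
                upwind_count[d] -= 1
                if upwind_count[d] == 0:
                    next_ready.append(d)
        ready = next_ready

    if scheduled != num_cells:
        # Unstructured meshes can contain cyclic dependencies for some angles.
        raise ValueError("cyclic sweep dependency detected")
    return schedule
```

On a structured grid this ordering follows directly from the cell indices; on an unstructured mesh it has to be computed from the face geometry for each angle, as above, which is one reason exposing concurrent work is harder in that setting.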
