Journal: Parallel Computing

pointerchain: Tracing pointers to their roots - A case study in molecular dynamics simulations


Abstract

As scientific frameworks become more sophisticated, so do their data structures. A data structure typically includes pointers and arrays to other structures in order to preserve the application's state. To keep a scientific application's data consistent on a modern high performance computing (HPC) architecture, such pointers must be managed on both the host and the device; because the two occupy separate memory spaces, the memory allocations involved have become complicated. The problem becomes so severe that one must go through a chain of pointers to extract the effective address. In this paper, we propose to reduce the need for excessive data transfer by introducing pointerchain, a directive that replaces pointer chains with their corresponding effective addresses inside the parallel region of a code. Based on our analysis, pointerchain leads to a 39% reduction in the amount of generated code and a 38% reduction in the total executed instructions. With pointerchain, we have parallelized CoMD, a Molecular Dynamics (MD) proxy application, on heterogeneous HPC architectures while maintaining a single portable codebase. This portable codebase utilizes OpenACC, an emerging directive-based programming model, to address the memory allocation needs of three computational kernels in CoMD. Two of the three embarrassingly parallel kernels benefit highly from OpenACC and perform better than their hand-written CUDA counterparts. The third kernel achieved 61% of the peak performance of its CUDA counterpart. The three kernels are common modules in any MD simulation. Our findings provide useful insights into parallelizing legacy MD software across heterogeneous platforms. (C) 2019 Elsevier B.V. All rights reserved.
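The idea of replacing a pointer chain with its effective address can be illustrated with a minimal C sketch. The struct layout (`Simulation`, `Atoms`, `Positions`) and the function names are hypothetical and are not taken from CoMD or from the paper. The hoisting is written by hand here, as a stand-in for what the abstract describes the pointerchain directive as doing; the directive's own syntax is not shown because the abstract does not give it.

```c
#include <stddef.h>

/* Hypothetical nested layout, loosely in the spirit of an MD code.
   These types are illustrative assumptions, not CoMD's definitions. */
typedef struct { double *x; } Positions;
typedef struct { Positions *pos; size_t n; } Atoms;
typedef struct { Atoms *atoms; } Simulation;

/* Baseline: every access walks the chain sim->atoms->pos->x.
   On an accelerator, each link of the chain would also have to be
   present and consistent in device memory. */
void shift_chained(Simulation *sim, double dx) {
    for (size_t i = 0; i < sim->atoms->n; ++i)
        sim->atoms->pos->x[i] += dx;
}

/* Hand-hoisted variant: resolve the chain once on the host and use
   the resulting effective address inside the parallel region, so
   only the flat array needs to be made available to the device. */
void shift_hoisted(Simulation *sim, double dx) {
    double *x = sim->atoms->pos->x;   /* effective address */
    size_t n = sim->atoms->n;

    /* Standard OpenACC directive; ignored by compilers built
       without OpenACC support, in which case the loop runs serially. */
    #pragma acc parallel loop copy(x[0:n])
    for (size_t i = 0; i < n; ++i)
        x[i] += dx;
}
```

Both functions compute the same result. The second form gives the compiler a single base pointer and an explicit extent to move to the device, instead of the whole chain of intermediate structures.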
