ACM/IEEE Conference on Supercomputing

Topology preserving dynamic load balancing for parallel molecular simulations



Abstract

Understanding the behavior of molecular systems such as DNA and proteins is a central problem in the field of physical chemistry. Most research laboratories now have a collection of heterogeneous processing architectures which could be brought to bear on this problem: high-end PCs, various workstations, and time on high-performance machines [8, 11, 20]. Much work is currently being done on programming tools to exploit these different architectures, but most of this work to date is aimed at the programmer, who must spend time learning how to use message passing or parallel object-oriented languages. The research scientist simply wishes to run their simulations as fast as possible. The work of our group includes the design of an environment [18] for the production of parallel code implementing the simulation of complex polymers, proteins and DNA molecules.

The problems encountered in simulations of polymers, DNA and proteins often have straightforward parallel realizations, since the domain can be decomposed either spatially or along the molecular chain, and a simple extension of the serial algorithm then applied (at least for molecular mechanics; Monte Carlo methods need a little more work to ensure the criterion of detailed balance is adhered to) [5, 6, 19]. However, since the problems are completely dynamic in nature, any such decomposition will lead to an inefficient parallelization after some time. Dynamic load balancing must then be applied to increase a simulation's efficiency.

To resolve this problem, several load-balancing techniques have been proposed for SIMD [2, 16] and MIMD [12, 21, 23, 25] models. We can distinguish between static and dynamic techniques. In contrast to static load balancing, where the decision as to which processors the workload is allocated to is fixed at compile time, dynamic load balancing takes the behavior of the application and the characteristics of the system into account in order to distribute the workload fairly among the available processors. Dynamic load balancing is also characterized by the manner in which data are exchanged and controlled. The control can be centralized [22, 4] or distributed [14, 28, 7]. Distributed load-balancing strategies are incorporated into each node of the system so that each can make decisions independently of the others. The control phase is characterized by the collection of information on the state of the system, such as the load on a processor or the number of idle processors, and by the decision making (whether the system is balanced or not). This phase is often absent in dynamic distribution strategies for data-parallel applications [9, 10, 17, 24]. The distribution phase is the main body of all such algorithms, which are generally based on the following criteria: 1) fair distribution of the load; 2) minimization of the communication introduced by the new distribution; 3) efficiency of the procedure that migrates data from overloaded nodes to under-loaded ones (an efficient routing algorithm); 4) respect for the topology of the application; and 5) simplicity of the algorithm, so that its execution time is kept to a minimum.

Problems in the area of molecular simulation, among others, can be characterized by the fact that some part of the problem topology remains fixed for at least large parts of the simulation: chemically bonded monomer groups remain close to their neighbors for most of the run. Load-balancing techniques that redistribute the problem domain without taking this fact into account lead to increased levels of communication, and hence to a reduction in efficiency. Our work is in developing topology-preserving load-balancing algorithms for MIMD machines, and in this paper we focus on the analysis and performance of one such algorithm, Positional Scan Load Balancing, and on its suitability for use in the runtime system of the parallel simulation environment.
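The abstract does not spell out the details of Positional Scan Load Balancing, but the general idea of a scan-based, topology-preserving repartitioning of a chain decomposition can be sketched. The Python sketch below is an illustrative assumption, not the paper's algorithm: a hypothetical list of per-monomer work estimates (weights, e.g. interaction counts measured in the previous step) is prefix-summed, and the chain is cut where the cumulative load crosses equal shares of the total, so every processor receives a contiguous segment and chemically bonded neighbors stay on the same or an adjacent processor.

# Illustrative sketch only: scan-based, topology-preserving repartitioning
# of a 1-D monomer chain; an assumption about the general technique, not
# the paper's Positional Scan Load Balancing implementation.
import bisect
from itertools import accumulate


def scan_partition(weights, num_procs):
    """Split a chain of per-monomer costs into num_procs contiguous ranges.

    A prefix sum (scan) of the weights is cut where the cumulative load
    crosses multiples of total/num_procs, so each processor receives a
    contiguous chain segment of roughly equal work. Returns a list of
    half-open (start, end) index ranges, one per processor.
    """
    total = sum(weights)
    prefix = list(accumulate(weights))      # inclusive scan of the load
    ranges, start = [], 0
    for p in range(1, num_procs):
        # first monomer at or after 'start' whose cumulative load
        # reaches the p-th equal share of the total work
        cut = bisect.bisect_left(prefix, p * total / num_procs, lo=start)
        cut = min(cut, len(weights) - 1)
        ranges.append((start, cut + 1))
        start = cut + 1
    ranges.append((start, len(weights)))    # last processor takes the rest
    return ranges


if __name__ == "__main__":
    # hypothetical costs for a 12-monomer chain distributed over 4 processors
    costs = [3, 1, 1, 4, 2, 2, 5, 1, 1, 3, 2, 2]
    for rank, (lo, hi) in enumerate(scan_partition(costs, 4)):
        print(f"rank {rank}: monomers [{lo}, {hi}) load {sum(costs[lo:hi])}")

Running the example prints four contiguous index ranges with their accumulated loads. In a real MIMD setting one would expect the cut points to be computed collectively (for example with a parallel prefix scan) and only the monomers that change segment to migrate between neighboring processors, which is consistent with the criteria listed above: the redistribution introduces little new communication and the chain topology of the application is respected.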

