Genetic algorithm based task reordering to improve the performance of batch scheduled massively parallel scientific applications

Ramanan Sankaran; Jordan Angel; W. Michael Brown

Abstract

The growth in size of networked high performance computers along with novel accelerator-based node architectures has further emphasized the importance of communication efficiency in high performance computing. The world’s largest high performance computers are usually operated as shared user facilities due to the costs of acquisition and operation. Applications are scheduled for execution in a shared environment and are placed on nodes that are not necessarily contiguous on the interconnect. Furthermore, the placement of tasks on the nodes allocated by the scheduler is sub-optimal, leading to performance loss and variability. Here, we investigate the impact of task placement on the performance of two massively parallel application codes on the Titan supercomputer, a turbulent combustion flow solver (S3D) and a molecular dynamics code (LAMMPS). Benchmark studies show a significant deviation from ideal weak scaling and variability in performance. The inter-task communication distance was determined to be one of the significant contributors to the performance degradation and variability. A genetic algorithm-based parallel optimization technique was used to optimize the task ordering. This technique provides an improved placement of the tasks on the nodes, taking into account the application’s communication topology and the system interconnect topology. Application benchmarks after task reordering through genetic algorithm show a significant improvement in performance and reduction in variability, thereby enabling the applications to achieve better time to solution and scalability on Titan during production.
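
The abstract does not give the details of the genetic algorithm, so the following is only an illustrative sketch of the general idea: search over permutations that assign tasks to a fixed set of allocated nodes, scoring each permutation by the total interconnect distance between communicating tasks. Everything specific here is an assumption made for illustration and not the authors' implementation: the small 3D torus with a hop-count metric, the 2D nearest-neighbor task grid, the order crossover and swap mutation operators, the elitist selection, and all function names. The sketch is also serial, whereas the paper describes a parallel optimization technique.

```python
import random

def torus_hops(a, b, dims):
    """Hop count between two node coordinates on a wraparound (torus) grid."""
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))

def comm_cost(order, nodes, edges, dims):
    """Total hop distance over all communicating task pairs.
    order[t] is the index into `nodes` of the node that hosts task t."""
    return sum(torus_hops(nodes[order[i]], nodes[order[j]], dims) for i, j in edges)

def order_crossover(p1, p2, rng):
    """Order crossover (OX): keep a slice of p1, fill the rest in p2's order."""
    n = len(p1)
    lo, hi = sorted(rng.sample(range(n), 2))
    child = [None] * n
    child[lo:hi + 1] = p1[lo:hi + 1]
    used = set(child[lo:hi + 1])
    fill = iter(g for g in p2 if g not in used)
    return [c if c is not None else next(fill) for c in child]

def swap_mutation(perm, rng, rate=0.2):
    """With probability `rate`, exchange the nodes of two random tasks."""
    perm = perm[:]
    if rng.random() < rate:
        i, j = rng.sample(range(len(perm)), 2)
        perm[i], perm[j] = perm[j], perm[i]
    return perm

def optimize_order(nodes, edges, dims, pop_size=40, generations=200, seed=0):
    """Elitist GA over task-to-node permutations (hypothetical parameters)."""
    rng = random.Random(seed)
    n = len(nodes)
    identity = list(range(n))  # the default ordering handed out by the scheduler
    population = [identity] + [rng.sample(identity, n) for _ in range(pop_size - 1)]

    def cost(perm):
        return comm_cost(perm, nodes, edges, dims)

    for _ in range(generations):
        population.sort(key=cost)
        elite = population[:pop_size // 4]  # survivors carried over unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            children.append(swap_mutation(order_crossover(p1, p2, rng), rng))
        population = elite + children
    return min(population, key=cost)

# Example: 16 tasks on a 4x4 nearest-neighbor grid, placed on 16 nodes
# scattered non-contiguously over an assumed 4x4x4 torus.
dims = (4, 4, 4)
coords = [(x, y, z) for x in range(4) for y in range(4) for z in range(4)]
nodes = random.Random(1).sample(coords, 16)
edges = [(r * 4 + c, r * 4 + c + 1) for r in range(4) for c in range(3)]
edges += [(r * 4 + c, (r + 1) * 4 + c) for r in range(3) for c in range(4)]

best = optimize_order(nodes, edges, dims)
print("default order cost:  ", comm_cost(list(range(16)), nodes, edges, dims))
print("reordered tasks cost:", comm_cost(best, nodes, edges, dims))
```

Because the default ordering is seeded into the initial population and the best individuals survive each generation unchanged, the returned ordering in this sketch is never worse than the default under the chosen cost function. Whether a lower hop count translates into better run time on a real machine depends on factors the sketch ignores, such as message sizes and network contention.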

Bibliographic details

Source
Concurrency and Computation | 2015, Issue 17 | pp. 4763-4783 | 21 pages
Authors
Ramanan Sankaran; Jordan Angel; W. Michael Brown
Author affiliations

Computational Scientist, Center for Computational Sciences, Oak Ridge National Laboratory, PO Box 2008, MS 6008, Oak Ridge, TN 37831-6008, USA;

Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA;

Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA;

Original format: PDF
Language: English
Keywords
performance variability; task mapping; genetic algorithm; optimization; communication topology;
