Published in: Operating Systems Review

Performance Driven Multi-Objective Distributed Scheduling for Parallel Computations


Abstract

With the advent of many-core architectures and the strong need for Petascale (and Exascale) performance in scientific domains and industry analytics, efficient scheduling of parallel computations for higher productivity and performance has become very important. Further, moving massive amounts of data (Terabytes to Petabytes) is very expensive, which necessitates affinity driven computations. Therefore, distributed scheduling of parallel computations on multiple places needs to optimize multiple performance objectives: follow affinity maximally and ensure efficient space, time and message complexity. Simultaneous consideration of these objectives makes distributed scheduling a particularly challenging problem. In addition, parallel computations have data dependent execution patterns, which require online scheduling to effectively optimize the orchestration of the computation as it unfolds. This paper presents an online algorithm for affinity driven distributed scheduling of multi-place parallel computations. To optimize multiple performance objectives simultaneously, our algorithm uses a mechanism with low time and message complexity for ensuring affinity, and a randomized work-stealing mechanism within places for load balancing. A theoretical analysis of the expected and probabilistic lower and upper bounds on the time and message complexity of this algorithm is provided. On multi-core clusters such as Blue Gene/P (MPP architecture) and an Intel multi-core cluster, we demonstrate performance close to that of custom MPI+Pthreads code. Further, strong, weak and data (increasing input data size) scalability are demonstrated on multi-core clusters. Using well known benchmarks, we demonstrate a 16% to 30% performance gain compared to Cilk [6] on a multi-core Intel Xeon 5570 (NUMA) architecture. Detailed experimental analysis illustrates efficient space (main memory) utilization as well.
To the best of our knowledge, this is the first time a multi-objective affinity driven distributed scheduling algorithm has been designed, theoretically analyzed and experimentally evaluated in a multi-place setup for multi-core cluster architectures.
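
The abstract names two mechanisms: tasks run at the place they have affinity to, and idle workers balance load by stealing from a randomly chosen worker within the same place. The abstract does not give the algorithm itself, so the following is only a minimal single-threaded simulation of that general idea, not the authors' method. All names in it (`Place`, `spawn`, `next_task`, `run`, and the `affinity` and `children` task fields) are hypothetical and chosen for illustration.

```python
import random
from collections import deque


class Place:
    """A place: a group of workers, each with its own double-ended task queue."""

    def __init__(self, place_id, num_workers):
        self.place_id = place_id
        self.deques = [deque() for _ in range(num_workers)]


def spawn(places, task, rng):
    """Affinity step (assumed): route the task to the place it is annotated with."""
    place = places[task["affinity"]]
    rng.choice(place.deques).append(task)


def next_task(place, worker, rng):
    """Pop local work first; otherwise steal from a random victim in the SAME place."""
    own = place.deques[worker]
    if own:
        return own.pop()  # owner takes the newest task from its own deque
    victims = [d for i, d in enumerate(place.deques) if i != worker and d]
    if victims:
        return rng.choice(victims).popleft()  # thief takes the oldest task
    return None  # nothing to do at this place right now


def run(places, tasks, seed=0):
    """Simulate round-robin workers until no task is left anywhere."""
    rng = random.Random(seed)
    for task in tasks:
        spawn(places, task, rng)
    executed = []  # (task name, id of the place where it ran)
    progress = True
    while progress:
        progress = False
        for place in places:
            for worker in range(len(place.deques)):
                task = next_task(place, worker, rng)
                if task is None:
                    continue
                progress = True
                executed.append((task["name"], place.place_id))
                # Children may carry affinity to a different place.
                for child in task.get("children", []):
                    spawn(places, child, rng)
    return executed


places = [Place(0, 2), Place(1, 2)]
tasks = [
    {"name": "a", "affinity": 0,
     "children": [{"name": "a1", "affinity": 1}, {"name": "a2", "affinity": 0}]},
    {"name": "b", "affinity": 1},
]
result = run(places, tasks)
print(sorted(result))  # [('a', 0), ('a1', 1), ('a2', 0), ('b', 1)]
```

Because stealing never crosses a place boundary in this sketch, every task runs at its affinity place regardless of the random choices; only the order of execution and the worker that runs each task vary with the seed.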
