【24h】

Efficient Online Scheduling for Deadline-Sensitive Jobs

机译:有效的在线计划,适用于对截止日期敏感的工作

获取原文
获取原文并翻译 | 示例

摘要

We consider mechanisms for online deadline-aware scheduling in large computing clusters. Batch jobs that run on such clusters often require guarantees on their completion time (i.e., deadlines). However, most existing scheduling systems implement fair-share resource allocation between users, an approach that ignores heterogeneity in job requirements and may cause deadlines to be missed. In our framework, jobs arrive dynamically and are characterized by their value and total resource demand (or estimation thereof), along with their reported deadlines. The scheduler's objective is to maximize the aggregate value of jobs completed by their deadlines. We circumvent known lower bounds for this problem by assuming that the input has slack, meaning that any job could be delayed and still finish by its deadline. Under the slackness assumption, we design a preemptive scheduler with a constant-factor worst-case performance guarantee. Along the way, we pay close attention to practical aspects, such as runtime efficiency, data locality and demand uncertainty. We evaluate the algorithm via simulations over real job traces taken from a large production cluster, and show that its actual performance is significantly better than other heuristics used in practice. We then extend our framework to handle provider commitments: the requirement that jobs admitted to service must be executed until completion. We prove that no algorithm can obtain worst-case guarantees when enforcing the commitment decision to the job arrival time. Nevertheless, we design efficient heuristics that commit on job admission, in the spirit of our basic algorithm. We show empirically that these heuristics perform just as well as (or better than) the original algorithm. Finally, we discuss how our scheduling framework can be used to design truthful scheduling mechanisms, motivated by applications to commercial public cloud offerings.
机译:我们考虑大型计算集群中用于在线截止日期感知调度的机制。在此类集群上运行的批处理作业通常需要保证其完成时间(即期限)。但是,大多数现有的调度系统都实现了用户之间的公平共享资源分配,这种方法忽略了工作要求的异质性,并可能导致错过最后期限。在我们的框架中,作业是动态到达的,并以其价值和总资源需求(或其估计)以及报告的截止日期为特征。调度程序的目标是在截止日期之前最大化作业的总价值。我们通过假设输入松弛来规避此问题的已知下限,这意味着任何作业都可能被延迟,并且仍然在其截止日期之前完成。在松弛假设下,我们设计了具有恒定因子最坏情况性能保证的抢占式调度程序。在此过程中,我们密切关注实际方面,例如运行时效率,数据局部性和需求不确定性。我们通过对大型生产集群中的实际工作轨迹进行仿真来评估该算法,并表明该算法的实际性能明显优于实际中使用的其他启发式算法。然后,我们扩展框架以处理提供者的承诺:必须执行接受服务的作业的要求,直到完成。我们证明,在执行对工作到达时间的承诺决策时,没有算法能够获得最坏情况的保证。尽管如此,我们还是本着基本算法的精神设计了有效的启发式方法,以求职。我们凭经验表明,这些启发式算法的性能与原始算法一样好(或更好)。最后,我们讨论了如何使用我们的调度框架来设计真实的调度机制,该机制受应用到商业公共云产品的推动。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号