【24h】

Efficient Online Scheduling for Deadline-Sensitive Jobs

机译:有效的在线计划截止日期敏感的工作

获取原文

摘要

We consider mechanisms for online deadline-aware scheduling in large computing clusters. Batch jobs that run on such clusters often require guarantees on their completion time (i.e., deadlines). However, most existing scheduling systems implement fair-share resource allocation between users, an approach that ignores heterogeneity in job requirements and may cause deadlines to be missed. In our framework, jobs arrive dynamically and are characterized by their value and total resource demand (or estimation thereof), along with their reported deadlines. The scheduler's objective is to maximize the aggregate value of jobs completed by their deadlines. We circumvent known lower bounds for this problem by assuming that the input has slack, meaning that any job could be delayed and still finish by its deadline. Under the slackness assumption, we design a preemptive scheduler with a constant-factor worst-case performance guarantee. Along the way, we pay close attention to practical aspects, such as runtime efficiency, data locality and demand uncertainty. We evaluate the algorithm via simulations over real job traces taken from a large production cluster, and show that its actual performance is significantly better than other heuristics used in practice. We then extend our framework to handle provider commitments: the requirement that jobs admitted to service must be executed until completion. We prove that no algorithm can obtain worst-case guarantees when enforcing the commitment decision to the job arrival time. Nevertheless, we design efficient heuristics that commit on job admission, in the spirit of our basic algorithm. We show empirically that these heuristics perform just as well as (or better than) the original algorithm. Finally, we discuss how our scheduling framework can be used to design truthful scheduling mechanisms, motivated by applications to commercial public cloud offerings.
机译:我们考虑在大型计算集群中在线截止日期感知调度的机制。在此类集群上运行的批处理通常需要保证完成时间(即截止日期)。然而,大多数现有的调度系统在用户之间实施公平份额资源分配,这一方法忽略了工作要求中的异质性,并且可能导致截止日期。在我们的框架中,就业机会动态到达,其特点是它们的价值和总资源需求(或其估计)以及他们报告的截止日期。调度程序的目标是最大限度地提高其截止日期所完成的作业的总值。我们假设输入有懈怠,我们将在这个问题上绕过这个问题的已知下限,这意味着任何工作都可以延迟并且仍然截止到截止日期。在松懈的假设下,我们设计了一种具有恒定因素最坏情况性能保证的抢占调度程序。沿途,我们密切关注实际方面,如运行时间效率,数据点和需求不确定性。我们评估通过模拟了从一个大的生产集群采取真正的工作痕迹算法,并证明它的实际性能显著优于在实践中使用的其他启发式。然后,我们将框架扩展到处理提供商承诺:必须执行录取的工作的要求,直到完成。我们证明,在对工作到达时间执行承诺决定时,没有算法可以获得最坏情况。尽管如此,我们以基本算法的精神设计求职的高效启发式机构。我们凭经验展示这些启发式效果和(或优于)原始算法。最后,我们讨论了我们的调度框架如何用于设计真实的调度机制,由应用于商业公共云产品的应用。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号