Abstract

It is well known that accounting for communication when scheduling jobs on large-scale parallel computing platforms is crucial. In modern hierarchical platforms, communication times differ greatly depending on whether the communication occurs within a cluster or between clusters. Allocating jobs under locality constraints is therefore a key factor in achieving good performance. However, several theoretical results prove that imposing such constraints reduces the solution space and can thus degrade performance. In practice, such constraints simplify implementations and most often lead to better results. Our aim in this work is to bridge theoretical and practical intuitions and to examine, through simulations, the differences between constrained and unconstrained schedules, specifically with respect to locality and node contiguity. We have developed a generic tool, built on SimGrid as the base simulator, that interacts with external batch schedulers to evaluate their scheduling policies. The results confirm that insights gained through theoretical models are ill-suited to current architectures and should be reevaluated.
