Journal: Journal of Parallel and Distributed Computing

Fault-tolerant scheduling on parallel systems with non-memoryless failure distributions



Abstract

As large parallel systems grow in size and complexity, failures become inevitable and exhibit complex spatial and temporal dynamics. In real systems, failure rates most often increase or decrease over time. Considering non-memoryless failure distributions, we study a bi-objective scheduling problem of optimizing application makespan and reliability. In particular, we determine whether one can optimize both makespan and reliability simultaneously, or whether one metric must be degraded in order to improve the other. We also devise scheduling algorithms for achieving (approximately) optimal makespan or reliability. When failure rates decrease, we prove that makespan and reliability are opposing metrics. In contrast, when failure rates increase, we prove that one can optimize both makespan and reliability simultaneously. Moreover, we show that the largest processing time (LPT) list scheduling algorithm achieves good performance when processors are of uniform speed. Our findings imply faster completion and improved reliability for parallel jobs executed across large distributed systems. Finally, we conduct simulations, based on a real biological sequence comparison application, to investigate the impact of failures on performance.
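For readers unfamiliar with the LPT rule mentioned in the abstract, the following is a minimal sketch, not the paper's implementation. It assumes independent tasks and processors of identical speed. The function names `lpt_schedule` and `weibull_reliability` are hypothetical, and the Weibull failure model is an illustrative assumption: it is one common non-memoryless distribution, whose failure rate decreases when `shape < 1` and increases when `shape > 1`. The abstract does not state which distributions the paper uses.

```python
import heapq
import math


def lpt_schedule(task_times, num_processors):
    """Largest-processing-time list scheduling on identical processors.

    Returns (makespan, assignment), where assignment[p] lists the task
    times placed on processor p.
    """
    # Min-heap of (current load, processor index): the least-loaded
    # processor is always at the top.
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_processors)]

    # Consider tasks in order of decreasing processing time and give each
    # one to the processor that is currently least loaded.
    for t in sorted(task_times, reverse=True):
        load, p = heapq.heappop(heap)
        assignment[p].append(t)
        heapq.heappush(heap, (load + t, p))

    makespan = max(load for load, _ in heap)
    return makespan, assignment


def weibull_reliability(loads, shape, scale):
    """Probability that no processor fails while it is busy.

    Assumption (not from the source): each processor fails independently
    with a Weibull lifetime started at time 0, so a processor busy for
    `load` time units survives with probability exp(-(load/scale)**shape).
    """
    return math.exp(-sum((load / scale) ** shape for load in loads))


if __name__ == "__main__":
    makespan, assignment = lpt_schedule([7, 5, 4, 3, 3, 2], 2)
    loads = [sum(tasks) for tasks in assignment]
    print(makespan, assignment)
    # Compare a decreasing (shape < 1) and an increasing (shape > 1) failure rate.
    print(weibull_reliability(loads, shape=0.7, scale=100.0))
    print(weibull_reliability(loads, shape=1.5, scale=100.0))
```

With `shape = 1` the Weibull model reduces to the memoryless exponential case, which makes it a convenient baseline when experimenting with how the shape parameter changes the trade-off between makespan and reliability.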
