IEEE Transactions on Computers

Risk-resilient heuristics and genetic algorithms for security-assured grid job scheduling



Abstract

In scheduling a large number of user jobs for parallel execution on an open-resource grid system, the jobs are subject to system failures or delays caused by infected hardware, software vulnerability, and distrusted security policy. This paper models the risk and insecure conditions in grid job scheduling. Three risk-resilient strategies (preemptive, replication, and delay-tolerant) are developed to provide security assurance. We propose six risk-resilient scheduling algorithms to assure secure grid job execution under different risky conditions. We report the simulated grid performance of these new grid job scheduling algorithms under the NAS and PSA workloads. The relative performance is measured by metrics including the total job makespan, grid resource utilization, job failure rate, slowdown ratio, and replication overhead. In addition to extending known scheduling heuristics, we developed a new space-time genetic algorithm (STGA) based on faster searching and protected chromosome formation. Our simulation results suggest that, in a wide-area grid environment, it is more resilient for the global job scheduler to tolerate some job delays than to resort to preemption or replication, or to take a risk on unreliable allocated resources. We find that delay-tolerant min-min and STGA job scheduling perform 13-23 percent better than the risky, preemptive, or replicated algorithms. The resource overheads for replicated job scheduling are kept at a low 15 percent. The delayed job execution is optimized with a delay factor, which is 20 percent of the total makespan. A Kiviat graph is proposed for demonstrating the quality of grid computing services. In a risky grid computing environment, these risk-resilient job scheduling schemes can upgrade grid performance significantly at only a moderate increase in extra resources or scheduling delays.
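The abstract names min-min as one of the known heuristics the authors extend, but it does not spell out the security model. As a rough illustration only, the sketch below pairs the standard min-min heuristic with a simple admission filter. The `demand` and `trust` values, the rule that a site is admissible when its trust level is at least the job's security demand, and the function name `min_min_schedule` are all assumptions made for this example; they are not taken from the abstract and should not be read as the paper's algorithms.

```python
from typing import Dict, List, Tuple


def min_min_schedule(
    etc: List[List[float]],
    demand: List[float],
    trust: List[float],
    secure_only: bool = True,
) -> Tuple[Dict[int, int], float]:
    """Min-min heuristic with an optional security filter (illustrative).

    etc[j][s]  -- estimated time to run job j on site s (hypothetical input)
    demand[j]  -- security demand of job j (hypothetical scale)
    trust[s]   -- trust level of site s (hypothetical scale)
    Returns the job -> site assignment and the resulting makespan.
    """
    n_sites = len(trust)
    ready = [0.0] * n_sites  # time at which each site becomes free
    pending = set(range(len(etc)))
    assignment: Dict[int, int] = {}

    while pending:
        best = None  # (completion_time, job, site)
        for j in sorted(pending):
            for s in range(n_sites):
                # Assumed rule: in secure mode, skip sites whose trust
                # level is below the job's security demand.
                if secure_only and trust[s] < demand[j]:
                    continue
                finish = ready[s] + etc[j][s]
                if best is None or finish < best[0]:
                    best = (finish, j, s)
        if best is None:
            raise ValueError("no admissible site for the remaining jobs")
        # Min-min step: commit the job with the smallest earliest
        # completion time to the site that achieves it.
        finish, j, s = best
        assignment[j] = s
        ready[s] = finish
        pending.remove(j)

    return assignment, max(ready)


if __name__ == "__main__":
    etc = [[2, 4], [3, 1], [6, 5]]
    demand = [0.9, 0.2, 0.2]
    trust = [0.5, 1.0]
    print(min_min_schedule(etc, demand, trust, secure_only=True))
    print(min_min_schedule(etc, demand, trust, secure_only=False))
```

Calling the function with `secure_only=False` corresponds loosely to a scheduler that accepts the risk of any site, while `secure_only=True` restricts each job to admissible sites. The preemptive, replication, and delay-tolerant strategies described in the abstract are not modeled here.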
