International Conference on Cluster Computing
A Runtime Heuristic to Selectively Replicate Tasks for Application-Specific Reliability Targets

Abstract

In this paper, we propose a runtime-based selective task replication technique for task-parallel high-performance computing applications. The technique is automatic and requires no modification or recompilation of the OS, compiler, or application code. Our heuristic, which we call App_FIT, selects tasks to replicate such that the reliability target specified for an application is achieved. Our experimental evaluation shows that the App_FIT selective replication heuristic is low-overhead and highly scalable. In addition, the results indicate that complete task replication is overkill for achieving reliability targets: with App_FIT, we can tolerate pessimistic exascale error rates with only 53% of the tasks being replicated.
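
The abstract does not spell out how the heuristic decides which tasks to replicate, so the following is only an illustrative sketch of budget-driven selective replication in general, not the App_FIT algorithm itself. It assumes each task carries a hypothetical failure-rate estimate (`fit`), that the application supplies a total failure-rate budget (`fit_budget`), that the number of tasks is known up front, and that a replicated task contributes nothing to the unprotected total. All names here are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    fit: float  # hypothetical per-task failure-rate estimate (assumption)


def select_replicas(tasks, fit_budget):
    """Greedy online selection (illustrative, not the paper's method).

    Tasks are visited in creation order. A task is replicated when leaving
    it unprotected would push the accumulated unprotected failure rate above
    the share of the budget earned so far (budget prorated by progress).
    """
    unprotected = 0.0  # failure rate accumulated by unreplicated tasks
    replicated = []
    n = len(tasks)
    for i, task in enumerate(tasks, start=1):
        allowed = fit_budget * i / n  # prorated budget after i tasks
        if unprotected + task.fit > allowed:
            replicated.append(task.name)  # assumed fully protected
        else:
            unprotected += task.fit
    return replicated, unprotected


demo = [Task("a", 4.0), Task("b", 1.0), Task("c", 3.0), Task("d", 2.0)]
print(select_replicas(demo, fit_budget=5.0))  # (['a', 'c'], 3.0)
```

In this toy run only two of the four tasks are replicated while the unprotected total stays within the budget, which mirrors the abstract's point that complete replication is not needed to meet a reliability target.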
