A Different Re-execution Speed Can Help

Abstract

We consider divisible load scientific applications executing on large-scale platforms subject to silent errors. While the goal is usually to complete the execution as fast as possible in expectation, another major concern is energy consumption. The use of dynamic voltage and frequency scaling (DVFS) can help save energy, but at the price of performance degradation. Consider the execution model where a set of K different speeds is given, and whenever a failure occurs, a different re-execution speed may be used. Can this help? We address the following bi-criteria problem: how to compute the optimal checkpointing period to minimize energy consumption while bounding the degradation in performance. We solve this bi-criteria problem by providing a closed-form solution for the checkpointing period, and demonstrate via a comprehensive set of simulations that a different re-execution speed can indeed help.
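The bi-criteria trade-off described in the abstract can be explored numerically with a small sketch. This is not the paper's closed-form solution: it uses a brute-force grid search over the amount of work per checkpoint and over pairs of speeds, under a simplified model that is entirely an assumption here (exponentially distributed silent errors detected by a verification at the end of each pattern, power equal to a static part plus a term cubic in speed, and fixed checkpoint and recovery costs). The names `pattern_cost`, `best_config`, and every numeric parameter are hypothetical.

```python
import math
from itertools import product

# All numbers below are illustrative assumptions, not values from the paper.
PARAMS = dict(lam=1e-4, C=60.0, V=10.0, R=60.0, p_idle=1.0, kappa=3.0, p_io=0.5)
SPEEDS = [0.4, 0.6, 0.8, 1.0]        # normalized DVFS speeds (K = 4)
WORK_GRID = range(250, 20001, 250)   # candidate amounts of work per checkpoint


def pattern_cost(W, s1, s2, lam, C, V, R, p_idle, kappa, p_io):
    """Expected (time, energy) of one pattern: W work units, verify, checkpoint.

    The first attempt runs at speed s1; every re-execution runs at speed s2.
    """
    def attempt(s):
        t = (W + V) / s                        # compute + verification time
        e = t * (p_idle + kappa * s ** 3)      # assumed power model
        p_err = 1.0 - math.exp(-lam * W / s)   # chance of a silent error
        return t, e, p_err

    e_rec = R * (p_idle + p_io)                # energy of one recovery
    t1, e1, p1 = attempt(s1)
    t2, e2, p2 = attempt(s2)
    # Re-executions repeat until verification passes:
    # X = a + p2 * (rec + X)  =>  X = (a + p2 * rec) / (1 - p2)
    t_re = (t2 + p2 * R) / (1.0 - p2)
    e_re = (e2 + p2 * e_rec) / (1.0 - p2)
    time = t1 + p1 * (R + t_re) + C
    energy = e1 + p1 * (e_rec + e_re) + C * (p_idle + p_io)
    return time, energy


def best_config(speeds, rho, work_grid, same_speed_only=False, **params):
    """Minimize energy per work unit subject to time per work unit <= rho."""
    best = None
    for s1, s2, W in product(speeds, speeds, work_grid):
        if same_speed_only and s1 != s2:
            continue
        t, e = pattern_cost(W, s1, s2, **params)
        if t / W <= rho:                       # performance bound
            cand = (e / W, s1, s2, W)
            if best is None or cand < best:
                best = cand
    return best                                # (energy/work, s1, s2, W) or None


two_speeds = best_config(SPEEDS, 1.5, WORK_GRID, **PARAMS)
one_speed = best_config(SPEEDS, 1.5, WORK_GRID, same_speed_only=True, **PARAMS)
print("two speeds:", two_speeds)
print("one speed: ", one_speed)
```

Because the single-speed configurations are a subset of the two-speed ones, the two-speed optimum can never use more energy in this model; whether it is strictly better depends on the assumed parameters.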
