Published in: International Journal of High Performance Computing Applications

Understanding Checkpointing Overheads on Massive-Scale Systems: Analysis of the IBM Blue Gene/P System

Abstract

Providing fault tolerance in high-end petascale systems, consisting of millions of hardware components and complex software stacks, is becoming an increasingly challenging task. Checkpointing continues to be the most prevalent technique for providing fault tolerance in such high-end systems. Considerable research has focused on optimizing checkpointing; however, in practice, checkpointing still imposes a high overhead on users. In this paper, we study the checkpointing overhead seen by various applications running on leadership-class machines like the IBM Blue Gene/P at Argonne National Laboratory. In addition to studying popular applications, we design a methodology to help users understand and intelligently choose an optimal checkpointing frequency to reduce the overall checkpointing overhead incurred. In particular, we study the Grid-Based Projector-Augmented Wave application, the Carr-Parrinello Molecular Dynamics application, the Nek5000 computational fluid dynamics application, and the Parallel Ocean Program application, and analyze their memory usage and possible checkpointing trends on 65,536 processors of the Blue Gene/P system.