Conference: IEEE International Parallel and Distributed Processing Symposium

Adaptive Incremental Checkpointing via Delta Compression for Networked Multicore Systems

Abstract

Checkpointing has been widely adopted to support fault tolerance and job migration, with checkpoint files preferably also kept at remote storage to withstand unavailability or failures of local nodes in networked systems. Lately, I/O bandwidth to remote storage has become the bottleneck for checkpointing on large-scale systems. This paper proposes adaptive incremental checkpointing (AIC), which aims to reduce the checkpoint file size considerably so that the associated overhead is lowered and the expected job turnaround time drops. Given that production multicore systems are observed to often have unused cores available, we design AIC to use separate cores to carry out multi-level checkpointing with delta compression adaptively at desirable points in time. We develop a new Markov model for predicting the performance of such multi-level concurrent checkpointing, and evaluate AIC performance using six SPEC benchmarks under various system sizes. AIC is observed to lower the normalized expected turnaround time substantially (by up to 47%) when compared to its static counterpart and a recent multi-level checkpointing scheme with fixed checkpoint intervals.
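To make the idea of incremental checkpointing with delta compression concrete, the following is a minimal Python sketch. It is not the AIC algorithm from the paper: the abstract does not specify the delta encoding, so the XOR-plus-zlib scheme, the function names `make_delta` and `apply_delta`, and the synthetic 64 KiB "process image" are all assumptions chosen for illustration. The adaptive choice of checkpoint times and the multi-level storage hierarchy are not modeled here.

```python
import random
import zlib


def make_delta(prev: bytes, curr: bytes) -> bytes:
    """Return a compressed delta between two equal-length snapshots.

    Assumption: the delta is the byte-wise XOR of the snapshots, which is
    mostly zeros when few bytes changed and therefore compresses well.
    """
    if len(prev) != len(curr):
        raise ValueError("this sketch requires equal-length snapshots")
    xored = bytes(a ^ b for a, b in zip(prev, curr))
    return zlib.compress(xored)


def apply_delta(prev: bytes, delta: bytes) -> bytes:
    """Rebuild the newer snapshot from the older one plus the delta."""
    xored = zlib.decompress(delta)
    return bytes(a ^ b for a, b in zip(prev, xored))


# Hypothetical 64 KiB process image filled with pseudo-random bytes,
# so that compressing the full image on its own gains very little.
rng = random.Random(0)
base = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

# Simulate a small amount of state change between two checkpoints.
changed = bytearray(base)
for i in range(1000, 1008):
    changed[i] ^= 0x5A
changed = bytes(changed)

full_checkpoint = zlib.compress(changed)   # full checkpoint, compressed
delta_checkpoint = make_delta(base, changed)  # incremental checkpoint
restored = apply_delta(base, delta_checkpoint)

print(len(full_checkpoint), len(delta_checkpoint), restored == changed)
```

In this sketch the incremental checkpoint is far smaller than the full one because only eight bytes differ from the previous snapshot, which is the effect the abstract relies on to relieve the I/O bandwidth bottleneck to remote storage.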