Conference proceedings: Reliable Distributed Systems, 1999. Proceedings of the 18th IEEE Symposium on

An efficient checkpointing algorithm for distributed systems implementing reliable communication channels

Abstract

This paper presents a new checkpointing algorithm that guarantees the semantics of reliable communication channels despite the crash and recovery of processes. The algorithm requires O(n+m) communication messages, where n is the number of participating processes and m is the number of "late" messages. The algorithm is nonblocking, requires minimal message logging, and has minimal stable storage requirements. It is also scalable, simple, and transparent to the user, and it facilitates fast recovery. By introducing a suitable delay in the checkpointing process, the parameter m can be made small. We also describe a variant of the algorithm that requires only O(n) messages, at a cost of O(n) additional storage for each process.