Journal of Parallel and Distributed Computing

Concurrent checkpoint initiation and recovery algorithms on asynchronous ring networks

Abstract

Checkpointing with rollback recovery is a well-known method for achieving fault-tolerance in distributed systems. In this work, we introduce algorithms for checkpointing and rollback recovery on asynchronous unidirectional and bi-directional ring networks. The proposed checkpointing algorithms can handle multiple concurrent initiations by different processes. While taking checkpoints, processes do not have to take into consideration any application message dependency. The synchronization is achieved by passing control messages among the processes. Application messages are acknowledged. Each process maintains a list of unacknowledged messages. Here we use a logical checkpoint, which is a standard checkpoint (i.e., snapshot of the process) plus a list of messages that have been sent by this process but are unacknowledged at the time of taking the checkpoint. The worst case message complexity of the proposed checkpointing algorithm is O(kn) when k initiators initiate concurrently. The time complexity is O(n). For the recovery algorithm, time and message complexities are both O(n).
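
The logical checkpoint described in the abstract (a process snapshot plus the list of messages that were sent but not yet acknowledged) can be illustrated with a small sketch. This is not the paper's algorithm: the control-message synchronization on the ring is omitted, and the names `Process`, `LogicalCheckpoint`, `send`, `on_ack`, and `take_logical_checkpoint` are hypothetical, chosen only for illustration.

```python
import copy
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LogicalCheckpoint:
    """Hypothetical container: standard checkpoint plus unacknowledged sends."""
    snapshot: dict   # deep copy of the process state at checkpoint time
    unacked: tuple   # (msg_id, dest, payload) triples still awaiting an ack


@dataclass
class Process:
    """Hypothetical process that tracks its own unacknowledged messages."""
    pid: int
    state: dict = field(default_factory=dict)
    _next_id: int = 0
    _unacked: dict = field(default_factory=dict)  # msg_id -> (dest, payload)

    def send(self, dest, payload):
        # Record the message as unacknowledged; actual delivery is out of scope.
        msg_id = self._next_id
        self._next_id += 1
        self._unacked[msg_id] = (dest, payload)
        return msg_id

    def on_ack(self, msg_id):
        # An acknowledgement removes the message from the pending list.
        self._unacked.pop(msg_id, None)

    def take_logical_checkpoint(self):
        # Snapshot the state and freeze the current list of pending messages.
        pending = tuple(
            (msg_id, dest, payload)
            for msg_id, (dest, payload) in sorted(self._unacked.items())
        )
        return LogicalCheckpoint(copy.deepcopy(self.state), pending)


p = Process(pid=0, state={"counter": 1})
first = p.send(dest=1, payload="a")
second = p.send(dest=1, payload="b")
p.on_ack(first)                      # only the second message is still pending
ckpt = p.take_logical_checkpoint()
p.state["counter"] = 2               # later changes do not alter the checkpoint
```

In this sketch the checkpoint depends only on the process's own state and its pending-send list, which matches the abstract's statement that processes need not consider application message dependencies when taking a checkpoint.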
