This paper proposes a pseudo-synchronous approach to checkpointing and recovery that uses only basic checkpoints. The direct-dependency concept used in communication-induced approaches is applied to basic checkpoints to design a simple algorithm for finding a consistent global checkpoint. In addition, the concept of forced checkpoints is used to ensure a small re-execution time after recovery from a failure. The proposed approach combines the advantages of the synchronous and asynchronous approaches, i.e., simple recovery and a simple way to create checkpoints. Moreover, the direct-dependency concept is implemented without piggybacking any extra information on application messages.