Checkpointing and rollback recovery are widely used techniques for handling failures in distributed computing systems. It is generally desirable to avoid taking checkpoints that turn out to be useless during recovery. Communication-induced checkpointing algorithms guarantee that every checkpoint is useful, and they give processes considerable autonomy at relatively low overhead. In this paper, we propose an enhanced communication-induced checkpointing algorithm. Our algorithm is expected to incur less checkpointing overhead than an existing algorithm in the literature.