A mechanism, the recovery manager (RM), that supports the recovery of data in a distributed system of workstations is presented. The recovery services provided by RM do not protect against media failures such as head crashes, but they do support recovery from system and software crashes on workstations with limited disk storage. RM services are intended to support data recovery for long-lived operations such as imaging and numerical applications. The services rely on the shadowing of data and support a two-phase commit (2PC) protocol in a distributed environment. The system model in which the RM services operate, RM itself, and the services it provides are described. The correctness criteria for RM services are defined. Work related to RM and the optimization of RM for long-lived operations are discussed. A proof of correctness of the services is given.