Rollback recovery offers an efficient method of recovering from transient faults and, when combined with spare-resource reconfiguration, from permanent faults. The hardware macro-rollback technique presented has been implemented in the advanced fault-tolerant data processor (AFTDP), a high-performance fault-tolerant shared-memory multiprocessor. The architecture discussion focuses on the unique problems of achieving both low overhead and fast recovery in high-throughput cached multiprocessors. Macro-rollback recovery from transient faults and hard macro-rollback from permanent faults are examined. In addition, deadline analysis based on a semi-Markov model is presented.