METHOD FOR SOFTWARE ERROR RECOVERY USING CONSISTENT GLOBAL CHECKPOINTS
Abstract
Disclosed is a method for error recovery in a multiprocessing computer system of the type in which each of the processes periodically takes checkpoints. In the event of a failure, a process can be rolled back to a prior checkpoint, and execution can continue from the checkpointed state. A monitor process monitors the execution of the processes. Upon the occurrence of a failure, a target set of checkpoints is identified, and the maximum consistent global checkpoint, which includes the target set of checkpoints, is computed. Each of the processes is rolled back to an associated checkpoint in the consistent global checkpoint. Upon a subsequent occurrence of the same failure, a second set of checkpoints is identified, and the minimum consistent global checkpoint, which includes the target set of checkpoints, is computed. Each of the processes is rolled back to an associated checkpoint in the consistent global checkpoint. Upon another occurrence of the same failure, the system is rolled back further to a coordinated checkpoint. Also disclosed are novel methods for calculating the minimum and maximum consistent global checkpoints. In accordance with one embodiment, the minimum and maximum consistent global checkpoints are calculated by a central process. In accordance with another embodiment, the minimum and maximum consistent global checkpoints are calculated in a distributed fashion by each of the individual processes.
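
The idea of a maximum consistent global checkpoint containing a target set can be illustrated with a small centralized sketch. This is not the method claimed in the patent, whose details the abstract does not give; it is a textbook-style rollback propagation under assumptions made here: checkpoints on each process are numbered from 0 (the initial state), each message is recorded as a hypothetical tuple `(sender, s, receiver, r)` meaning "sent after the sender's checkpoint `s`, received before the receiver's checkpoint `r`", and a global checkpoint is treated as consistent when it contains no orphan message (one received before the receiver's checkpoint but sent after the sender's). The function name `max_consistent_checkpoint` is likewise illustrative.

```python
from typing import Dict, List, Optional, Tuple

# Assumed message record: (sender, s, receiver, r)
#   sent by `sender` after its checkpoint s,
#   received by `receiver` before its checkpoint r (so r >= 1).
Message = Tuple[int, int, int, int]


def max_consistent_checkpoint(
    latest: List[int],
    messages: List[Message],
    target: Dict[int, int],
) -> Optional[List[int]]:
    """Return the latest consistent global checkpoint containing `target`.

    latest[p] is the index of the most recent checkpoint of process p.
    target maps a process id to the checkpoint it must roll back to.
    Returns None if no consistent global checkpoint contains the target set.
    """
    # Start as late as possible: target processes are pinned,
    # all others begin at their most recent checkpoint.
    cut = list(latest)
    for p, c in target.items():
        cut[p] = c

    changed = True
    while changed:
        changed = False
        for sender, s, receiver, r in messages:
            sent_after_cut = s >= cut[sender]
            received_before_cut = r <= cut[receiver]
            if sent_after_cut and received_before_cut:
                # Orphan message: the receiver must forget the receive.
                if receiver in target:
                    return None  # a pinned checkpoint cannot move
                cut[receiver] = r - 1  # latest checkpoint before the receive
                changed = True
    return cut


if __name__ == "__main__":
    latest = [2, 2, 2]
    # P0 -> P1 and P1 -> P2, each sent after checkpoint 1, received before 2.
    messages = [(0, 1, 1, 2), (1, 1, 2, 2)]
    print(max_consistent_checkpoint(latest, messages, {0: 1}))
```

In this example, pinning process 0 at checkpoint 1 makes the first message an orphan, which forces process 1 back to checkpoint 1, which in turn forces process 2 back as well. Each rollback strictly lowers one entry of the candidate, so the loop terminates, and because a process is moved back only when an orphan message forces it, the result is the latest consistent global checkpoint under the stated model.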