FAILURE RECOVERY FOR TRANSPLANTING ALGORITHMS FROM CLUSTER TO CLOUD
Abstract
Disclosed is a method (400) of providing failure recovery capabilities to a cloud environment (10) for scientific HPC applications. An HPC application with an MPI implementation extends the class of MPI programs so that the application is embedded with various degrees of fault tolerance. An MPI fault tolerance mechanism realizes a recover-and-continue solution: if an error occurs, only the failed processes are re-spawned, while the surviving processes remain on their original processors/nodes (12, 14, 16), thereby minimizing system recovery costs.
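The recover-and-continue idea can be illustrated with a small bookkeeping sketch. It is not the patented method and uses no MPI calls; it only models which ranks are restarted. The `Proc` record, the `recover_and_continue` function, the node names, and the use of a caller-supplied list of spare nodes are all hypothetical and are assumed here purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    """Hypothetical record for one process (rank) of the application."""
    rank: int
    node: str
    alive: bool = True
    spawn_count: int = 1  # how many times this rank has been started


def recover_and_continue(procs, spare_nodes):
    """Re-spawn only the failed ranks; survivors are left untouched.

    Assumption: a failed rank is restarted on the next spare node.
    Returns the list of ranks that were re-spawned.
    """
    spares = list(spare_nodes)
    respawned = []
    for p in procs:
        if p.alive:
            continue  # survivor stays on its original node
        if not spares:
            raise RuntimeError(f"no spare node available for rank {p.rank}")
        p.node = spares.pop(0)
        p.alive = True
        p.spawn_count += 1
        respawned.append(p.rank)
    return respawned


if __name__ == "__main__":
    procs = [Proc(0, "node12"), Proc(1, "node14"), Proc(2, "node16")]
    procs[1].alive = False  # simulate a failure of rank 1
    print(recover_and_continue(procs, ["spare0"]))
    print([(p.rank, p.node, p.spawn_count) for p in procs])
```

In this sketch the recovery cost scales with the number of failed ranks rather than with the size of the job, which is the property the abstract attributes to re-spawning only the failed processes.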