江苏科技信息 (Jiangsu Science & Technology Information)

Design and Implementation of a Distributed Task Fault Redundancy Management Mechanism for Cluster Systems

     

Abstract

As the number of nodes and the scale of computing tasks in cluster systems keep growing, the probability of task failure rises accordingly. The distributed task fault redundancy mechanism presented here was designed and implemented to address this problem. The paper first introduces the architecture of distributed task fault redundancy management for cluster systems. It then describes the key technologies in detail: task failure detection and recovery, elimination of the cluster's single point of failure, task status synchronization, and user interaction. Finally, tests in a laboratory environment show that the mechanism improves the reliability of the cluster system and keeps its distributed tasks running stably.
