International Symposium on Advanced Parallel Processing Technologies

Self-adaptive Failure Detector for Peer-to-Peer Distributed System Considering the Link Faults

Abstract

Distributed computing now prevails in artificial intelligence applications because a single computing node has limited computation capacity. A distributed computing system generally contains a large number of computing nodes, so failures are a routine occurrence. Failure detection therefore plays an important role in recovering the system and in maintaining its availability and performance. Traditional failure detectors simply equate a link fault with a node fault, which greatly impairs resource utilization, fault localization, and fast repair. We present DLFDA, a self-adaptive link-based failure detection agreement with an improved node fault detection algorithm, which can accurately distinguish node faults from link faults. DLFDA dynamically adjusts the detection structure to increase the coverage of link fault detection, and uses the Gossip protocol to distribute fault diagnosis results to other system members, which substantially reduces the damage to system performance. Finally, the experimental results show that our method meets the requirements of the theoretical design.
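The abstract does not give the details of DLFDA, so the following is only a minimal sketch of the general idea of separating link faults from node faults, not the authors' algorithm. It assumes a common technique, indirect probing: when a direct probe fails, the observer asks other peers to probe the target on its behalf. The names `classify_fault`, `can_reach`, `broken_links`, and `down_nodes` are hypothetical and exist only for this illustration.

```python
from typing import Callable, Iterable

def classify_fault(
    observer: str,
    target: str,
    helpers: Iterable[str],
    can_reach: Callable[[str, str], bool],
) -> str:
    """Classify `target` as seen from `observer`.

    Returns "healthy", "link_fault", or "node_fault".
    `can_reach(src, dst)` is an assumed probe primitive (e.g. a heartbeat
    with a timeout); it is not part of any real library.
    """
    if can_reach(observer, target):
        return "healthy"
    # The direct probe failed. Ask each helper that the observer can still
    # reach to probe the target on the observer's behalf.
    for helper in helpers:
        if can_reach(observer, helper) and can_reach(helper, target):
            # Someone else can reach the target, so the node is alive and
            # only the observer-to-target link is faulty.
            return "link_fault"
    # No reachable helper could contact the target either. This is only a
    # suspicion: a network partition would look the same.
    return "node_fault"

# Simulated network used for illustration only (hypothetical data).
broken_links = {frozenset(("a", "b"))}  # the link a<->b is down
down_nodes = {"d"}                      # node d has crashed

def simulated_can_reach(src: str, dst: str) -> bool:
    """Probe succeeds if both nodes are up and the link between them works."""
    if src in down_nodes or dst in down_nodes:
        return False
    return frozenset((src, dst)) not in broken_links

result_b = classify_fault("a", "b", ["c"], simulated_can_reach)
result_d = classify_fault("a", "d", ["b", "c"], simulated_can_reach)
result_c = classify_fault("a", "c", [], simulated_can_reach)
```

In this simulation, node `b` is still reachable through helper `c`, so the failed probe from `a` is attributed to the link, while node `d` is unreachable from every helper and is suspected to have failed. A real detector would then disseminate such verdicts to other members, for example by gossip as the abstract describes.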
