Journal: Software Quality Journal

A fault tolerant election-based deadlock detection algorithm in distributed systems



Abstract

Detecting deadlocks in a distributed system without shared memory is important for ensuring system reliability. Detection becomes more complex when multiple instances of a deadlock detection algorithm execute concurrently, and communication disconnection between computing nodes or processes makes it harder still. Existing centralized algorithms suffer from a single point of failure at the central controller (caused by communication disconnection) and perform poorly under concurrent execution. In this paper, we extend our previous work (Lu et al. 2016) and propose a fault-tolerant deadlock detection algorithm for distributed systems. The extended algorithm tolerates a certain degree of communication disconnection between computing nodes or processes. A central controller collects requesting conditions, constructs a wait-for graph, and detects deadlocks. If the current controller fails because of communication disconnections, the algorithm elects a new one. The liveness and safety properties of the algorithm are proved in this paper. Experimental results show that the algorithm outperforms most existing algorithms in message count, data traffic, and execution time. It also provides additional fault tolerance under communication disconnection compared with existing deadlock detection algorithms.
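The controller's role described in the abstract (collect requests, build a wait-for graph, detect deadlocks, and be replaced on failure) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function names `build_wait_for_graph`, `find_deadlock`, and `elect_controller` are hypothetical, the reports are assumed to arrive as (waiter, holder) pairs, and the "highest reachable ID wins" election rule is a simple stand-in chosen only for illustration.

```python
from collections import defaultdict


def build_wait_for_graph(reports):
    """Merge (waiter, holder) pairs reported to the controller into a graph.

    Assumption: each report means "waiter is blocked on a resource held by holder".
    """
    graph = defaultdict(set)
    for waiter, holder in reports:
        graph[waiter].add(holder)
    return graph


def find_deadlock(wait_for):
    """Return one cycle in the wait-for graph as a list of processes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = defaultdict(int)
    stack = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in wait_for.get(node, ()):
            if color[nxt] == GREY:
                # Back edge to a node on the current path: a cycle, i.e. a deadlock.
                return stack[stack.index(nxt):]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(wait_for):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


def elect_controller(nodes, reachable):
    """Hypothetical election rule (not from the paper): highest reachable ID wins."""
    candidates = [n for n in nodes if n in reachable]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    reports = [("P1", "P2"), ("P2", "P3"), ("P3", "P1"), ("P4", "P1")]
    print(find_deadlock(build_wait_for_graph(reports)))  # ['P1', 'P2', 'P3']
    # Node 4 was the controller and has become unreachable.
    print(elect_controller([1, 2, 3, 4], reachable={1, 2, 3}))  # 3
```

In this sketch a deadlock is any cycle found by depth-first search over the merged graph. A real implementation would also have to handle concurrent detection instances and partial connectivity during the election, which is the part the paper addresses.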
