首页> 外文期刊>IEEE Transactions on Parallel and Distributed Systems >Multistep interactive convergence: an efficient approach to the fault-tolerant clock synchronization of large multicomputers
【24h】

Multistep interactive convergence: an efficient approach to the fault-tolerant clock synchronization of large multicomputers

机译:多步交互式收敛:大型多计算机的容错时钟同步的有效方法

获取原文
获取原文并翻译 | 示例

摘要

We present a new approach for fault-tolerant internal clock synchronization in multicomputer systems employing not completely connected networks (NCCNs). The approach is referred to as multistep interactive convergence and is locally implemented in each multicomputer node by a time server process (TSP). We describe a specific algorithm that uses multistep interactive convergence and bases its operation on a logical mapping of the system's TSPs into an m-dimensional array. A TSP executes m steps per round of synchronization, with each step including a call to an interactive convergence procedure. For any TSP, clock readings in step i are gathered only from TSPs with which it shares a row along dimension i of the array. Hence, a TSP reads clocks only from a small subset of the TSPs in the system, which reduces the number of messages by orders of magnitude over a conventional interactive convergence algorithm in which reliable all-to-all broadcast of clock values is done. The algorithm can be used in systems of arbitrary topology and provides the added benefit of increased locality of communication in regular NCCNs such as hypercubes and tori. These advantages can be combined with a variety of message staggering mechanisms to maintain network contention at a minimum. We present expressions for the maximum clock skew, maximum clock drift, maximum clock discontinuity, and number of messages produced by the algorithm, and show that it tolerates arbitrary faults. A comparison with other algorithms that elucidates the advantages of multistep interactive convergence is also provided.
机译:我们提出了一种新方法,用于采用不完全连接的网络(NCCN)的多计算机系统中的容错内部时钟同步。该方法称为多步骤交互式收敛,并通过时间服务器进程(TSP)在每个多计算机节点中本地实现。我们描述了一种使用多步交互式收敛并将其操作基于系统TSP到m维数组的逻辑映射的特定算法。 TSP每轮同步执行m个步骤,每个步骤都包括对交互式收敛过程的调用。对于任何TSP,步骤i中的时钟读数仅从与其共享沿阵列维度i行的TSP收集。因此,TSP仅从系统中TSP的一小部分中读取时钟,与传统的交互式会聚算法相比,它可将消息数量减少几个数量级,而在传统的交互式会聚算法中,时钟值进行了可靠的所有广播。该算法可用于任意拓扑结构的系统中,并具有增加常规NCCN(例如超立方体和花托)中通信局部性的好处。这些优点可以与各种消息交错机制组合在一起,以将网络争用保持在最低水平。我们给出了最大时钟偏斜,最大时钟漂移,最大时钟不连续性以及该算法产生的消息数量的表达式,并显示了它可以容忍任意故障。还提供了与其他算法的比较,阐明了多步交互式收敛的优势。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号