Conference: 2013 IEEE 22nd International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises

A Delayed Checkpoint Approach for Communication-Induced Checkpointing in Autonomic Computing


Abstract

Although the Autonomic Computing initiative was introduced a dozen years ago, several challenges remain open. One of them is efficient runtime monitoring aimed at detecting, diagnosing, and repairing problems that result from failures or bugs in software and/or hardware components. For this purpose, Communication-Induced Checkpointing (CIC) can be a useful tool. CIC has been used to address a wide range of problems that arise in distributed systems, such as rollback recovery, software debugging, and software verification, among others. In CIC algorithms, autonomic components (processes) cooperate asynchronously by piggybacking on application messages information about saved local states, called checkpoints. CIC aims to form consistent global snapshots by grouping checkpoints (one per component) in a non-coordinated way. To achieve this, CIC solutions continuously monitor the exchanged control information to identify potentially dangerous checkpointing patterns. When a dangerous pattern is identified, it is broken by locally triggering a forced checkpoint. Nevertheless, as we will show, not all forced checkpoints triggered by current solutions are necessary. In this paper, we present a delayed checkpoint approach suitable for autonomic computing that reduces the number of forced checkpoints by establishing triggering rules that we call safe checkpoint conditions. Finally, we present results showing that our proposal is more efficient than other current solutions.
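To make the mechanism in the abstract concrete, the following is a minimal sketch of a generic index-based CIC rule, in the style of the classic BCS protocol. It is not the delayed approach proposed in the paper, whose safe checkpoint conditions are not given in the abstract. All names here (`Process`, `Message`, `sender_index`, `basic_checkpoint`) are hypothetical and chosen only for illustration. Each process piggybacks its checkpoint index on outgoing messages, and a receiver takes a forced checkpoint before delivery whenever the piggybacked index is ahead of its own. This is the eager behavior that a delayed approach would aim to reduce.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    payload: str
    sender_index: int  # control information piggybacked on the application message


@dataclass
class Process:
    name: str
    index: int = 0  # index of the latest local checkpoint
    checkpoints: list = field(default_factory=list)  # (index, kind) pairs

    def basic_checkpoint(self) -> None:
        """Checkpoint taken on the process's own initiative."""
        self.index += 1
        self.checkpoints.append((self.index, "basic"))

    def send(self, payload: str) -> Message:
        """Attach the current checkpoint index to the outgoing message."""
        return Message(payload, self.index)

    def receive(self, msg: Message) -> bool:
        """Return True if a forced checkpoint was taken before delivery."""
        forced = False
        if msg.sender_index > self.index:
            # Assumed eager rule: the sender is ahead, so checkpoint now
            # to keep same-index checkpoints mutually consistent.
            self.index = msg.sender_index
            self.checkpoints.append((self.index, "forced"))
            forced = True
        # The application message would be delivered here.
        return forced


p, q = Process("p"), Process("q")
p.basic_checkpoint()                 # p moves to index 1
first = q.receive(p.send("m1"))      # q is behind, so it is forced to checkpoint
second = q.receive(p.send("m2"))     # q has caught up, so no checkpoint is needed
print(first, second, q.checkpoints)
```

In this sketch every message carrying a larger index triggers a checkpoint immediately on receipt. The abstract's claim is that some of these forced checkpoints are unnecessary and can be avoided by deferring the decision until a safe checkpoint condition is evaluated.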
