Fault Tolerance via Idempotence

Abstract

Building distributed services and applications is challenging due to the pitfalls of distribution such as process and communication failures. A natural solution to these problems is to detect potential failures, and retry the failed computation and/or resend messages. Ensuring correctness in such an environment requires distributed services and applications to be idempotent. In this paper, we study the inter-related aspects of process failures, duplicate messages, and idempotence. We first introduce a simple core language (based on λ-calculus) inspired by modern distributed computing platforms. This language formalizes the notions of a service, duplicate requests, process failures, data partitioning, and local atomic transactions that are restricted to a single store. We then formalize a desired (generic) correctness criterion for applications written in this language, consisting of idempotence (which captures the desired safety properties) and failure-freedom (which captures the desired progress properties). We then propose language support in the form of a monad that automatically ensures failfree idempotence. A key characteristic of our implementation is that it is decentralized and does not require distributed coordination. We show that the language support can be enriched with other useful constructs, such as compensations, while retaining the coordination-free decentralized nature of the implementation. We have implemented the idempotence monad (and its variants) in F# and C# and used our implementation to build realistic applications on Windows Azure. We find that the monad has low runtime overheads and leads to more declarative applications.
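As a rough illustration of why retries require idempotence, the sketch below logs each request's result under a unique request identifier, inside the same local atomic step that applies the update, so a duplicate request replays the logged result instead of re-executing. This is not the paper's monad or its F#/C# implementation; `Store`, `run_once`, `deposit`, and the request identifiers are hypothetical names chosen for this example, and a lock stands in for a single-store atomic transaction.

```python
import threading
from typing import Callable, Dict


class Store:
    """Hypothetical single-partition store (an assumption for this sketch)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.data: Dict[str, int] = {}
        self.log: Dict[str, int] = {}  # request id -> recorded result

    def run_once(self, request_id: str,
                 action: Callable[[Dict[str, int]], int]) -> int:
        """Apply `action` at most once per request id; replay the log otherwise."""
        # The lock only models isolation; a real store would also roll back
        # a partially applied update if the process failed mid-transaction.
        with self._lock:
            if request_id in self.log:
                return self.log[request_id]
            result = action(self.data)
            self.log[request_id] = result
            return result


def deposit(store: Store, request_id: str, account: str, amount: int) -> int:
    """Add `amount` to `account` and return the new balance."""
    def action(data: Dict[str, int]) -> int:
        data[account] = data.get(account, 0) + amount
        return data[account]
    return store.run_once(request_id, action)


store = Store()
first = deposit(store, "req-1", "alice", 10)
retry = deposit(store, "req-1", "alice", 10)  # duplicate, e.g. resent after a timeout
second = deposit(store, "req-2", "alice", 5)
```

Here the retried `req-1` leaves the balance unchanged and returns the same result as the first attempt, while `req-2` is treated as a new request. Because the log lives in the same store as the data it guards, no coordination across stores is needed in this sketch.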
