A programming model and foundation for lineage-based distributed computation


Abstract

The most successful systems for "big data" processing have all adopted functional APIs. We present a new programming model, which we call function passing, designed to provide a more principled substrate, or middleware, upon which to build data-centric distributed systems like Spark. A key idea is to build up a persistent functional data structure representing transformations on distributed immutable data by passing well-typed serializable functions over the wire and applying them to this distributed data. Thus, the function passing model can be thought of as a distributed persistent functional data structure whose nodes store the transformations performed on distributed data rather than the distributed data itself. One advantage of this model is that failure recovery is simplified by design: data can be recovered by replaying function applications atop immutable data loaded from stable storage. Deferred evaluation is also central to our model; by incorporating deferred evaluation into our design only at the point of initiating network communication, the function passing model remains easy to reason about while staying efficient in time and memory. Moreover, we provide a complete formalization of the programming model in order to study the foundations of lineage-based distributed computation. In particular, we develop a theory of safe, mobile lineages based on a subject reduction theorem for a typed core language. Furthermore, we formalize a progress theorem that guarantees the finite materialization of remote, lineage-based data. Thus, the formal model may serve as a basis for further developments of the theory of data-centric distributed programming, including aspects such as fault tolerance. We provide an open-source implementation of our model in and for the Scala programming language, along with a case study of several example frameworks and end-user programs written atop this model.
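
The idea of storing transformations in the nodes of a data structure, and recovering data by replaying them, can be illustrated with a minimal single-process sketch. This is not the paper's implementation or API: the names `Lineage`, `Source`, `Mapped`, `loadNumbers`, and `pipeline` are hypothetical, an in-memory vector stands in for stable storage, and serialization and network transfer of the functions are left out entirely.

```scala
// Hypothetical sketch: the names below are illustrative, not the paper's API.
object LineageSketch {
  // A lineage node stores a transformation rather than the data it yields.
  sealed trait Lineage[A] {
    def materialize(): A
    // Recording a transformation only adds a node; nothing is computed yet.
    def map[B](f: A => B): Lineage[B] = Mapped(this, f)
  }

  // Leaf node: knows how to (re)load immutable data from stable storage.
  final case class Source[A](load: () => A) extends Lineage[A] {
    def materialize(): A = load()
  }

  // Inner node: a parent lineage plus the function applied to its result.
  final case class Mapped[A, B](parent: Lineage[A], f: A => B) extends Lineage[B] {
    def materialize(): B = f(parent.materialize())
  }

  // Assumed stand-in for stable storage; the counter makes loads observable.
  private val stored = Vector(1, 2, 3, 4)
  var loads = 0
  def loadNumbers(): Vector[Int] = { loads += 1; stored }

  // Builds the lineage "load, double every element, sum" without running it.
  def pipeline(): Lineage[Int] =
    Source(() => loadNumbers()).map(_.map(_ * 2)).map(_.sum)

  def main(args: Array[String]): Unit = {
    val total = pipeline()
    println(s"loads after building the lineage: $loads")
    println(s"first materialization: ${total.materialize()}")
    // Simulated recovery: the result was lost, so the lineage is replayed.
    println(s"replayed after failure: ${total.materialize()}")
    println(s"loads after two materializations: $loads")
  }
}
```

In this sketch, building the pipeline touches no data, which mirrors deferred evaluation, and calling `materialize()` a second time recomputes the same value from the stored input, which mirrors recovery by replay. A real system along the lines described in the abstract would additionally need the recorded functions to be serializable and the nodes to live on remote hosts.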

Bibliographic record

  • Source
    Journal of Functional Programming | 2018, issue 2018 | e7.1-e7.48 | 48 pages
  • Author affiliations

    KTH Royal Inst Technol, Sch Elect Engn & Comp Sci, SE-10044 Stockholm, Sweden;

    Ecole Polytech Fed Lausanne, Sch Comp & Commun Sci, CH-1015 Lausanne, Switzerland;

    Safeplace, DE-40667 Meerbusch, Germany;

  • Indexing: Science Citation Index (SCI), United States
  • Original format: PDF
  • Language: English
