Partial fault tolerance in stream processing applications - methods and evaluation techniques

Abstract

Stream processing emerged as a paradigm for continuously processing incoming live data streams, such as audio, video, and business feeds. Stream processing applications are assembled as dataflow graphs, where each vertex of the graph is a stream operator and each edge is a stream connection. In this environment, a fault in a stream operator can result in massive data loss or in the generation of inaccurate results. Most of the fault tolerance solutions proposed for streaming applications aim to guarantee that no data is lost or that no data item is delivered to the application more than once. These techniques incur high performance overhead, given the need to coordinate the state stored in checkpoints of distributed components or to maintain consistency between replicas. In this dissertation, we investigate partial fault tolerance methods, which protect only the most critical stream operators of a streaming application. These methods take advantage of the fact that stream processing algorithms are approximate by nature and, as a result, can still achieve acceptable results under data loss and duplicate data delivery. The methods proposed in this dissertation include a checkpoint-based mechanism and a partial graph replication technique. Both techniques were implemented in System S, IBM Research's stream processing middleware. In addition, this dissertation describes two different fault tolerance evaluation techniques. The first technique is based on fault injection and is used to emulate the effects of partial fault tolerance on a streaming application. With the fault injection results, developers can understand the impact of faults on the application output and identify the most critical operators in their streaming application. The second evaluation technique is a model-based framework that provides generic abstractions for representing streaming applications with the stochastic activity network formalism. The framework allows the comparison of different fault tolerance techniques under varying fault models. Based on the results, developers can evaluate the trade-offs that a given technique provides when applied to their target application.
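
As a rough illustration of the fault-injection idea mentioned in the abstract, the Python sketch below emulates data loss in one operator at a time and compares the resulting output against a fault-free run. It is not the dissertation's method and does not use any System S API: the two-branch application, the operator names (`heavy_branch`, `light_branch`), the loss model, and the relative-error metric are all assumptions made for this example.

```python
import random

# Hypothetical toy application: every input item is sent to two parallel
# operators, and a sink sums everything they emit.
BRANCHES = [
    ("heavy_branch", lambda x: 10.0 * x),  # contributes most of the output
    ("light_branch", lambda x: 0.1 * x),   # contributes very little
]


def run(source, faulty=None, loss_rate=0.0, seed=0):
    """Return the sink total. Items reaching the operator named `faulty`
    are dropped with probability `loss_rate` to emulate data loss."""
    rng = random.Random(seed)  # fixed seed keeps runs repeatable
    total = 0.0
    for item in source:
        for name, op in BRANCHES:
            if name == faulty and rng.random() < loss_rate:
                continue  # the item is lost at the faulty operator
            total += op(item)
    return total


def criticality(source, loss_rate=0.5):
    """Relative output error observed when loss is injected into each operator."""
    baseline = run(source)
    return {
        name: abs(baseline - run(source, faulty=name, loss_rate=loss_rate)) / baseline
        for name, _ in BRANCHES
    }


if __name__ == "__main__":
    scores = criticality([1.0] * 1000)
    # Operators with the largest error are the first candidates for protection.
    for name, err in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{name}: relative output error {err:.3f}")
```

Under these assumptions, the operator whose loss distorts the output most would be the one to protect first, for example with checkpointing or replication, while the other could be left unprotected.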

Bibliographic record

  • Author: Jacques da Silva Gabriela
  • Year: 2010
  • Original format: PDF
  • Language: English
