
Tracing fine-grained provenance in stream processing systems using a reverse mapping method

Abstract

Applications that require continuous processing of high-volume data streams have grown in prevalence and importance. Such systems often process streaming data in real time or near real time and provide instantaneous responses to support precise and timely decisions. In such systems it is difficult to know exactly how a particular result was generated, yet this information is extremely important for the validation and verification of stream processing results. It is therefore crucial that stream processing systems have a mechanism for tracking provenance (the information pertaining to the process that produced result data) at the level of individual stream elements, which we refer to as fine-grained provenance tracking for streams. Traceability allows users to validate individual stream elements, to verify the computation that took place, and to understand the chain of reasoning used in producing a stream processing result. Recent solutions to provenance tracking in stream processing systems mainly focus on coarse-grained stream provenance, in which the level of granularity for capturing provenance information is not detailed enough to address our problem. This thesis proposes a novel fine-grained provenance solution for streams that exploits a reverse mapping method to precisely capture dependency relationships for every individual stream element. It is also designed to support a stream-specific provenance query mechanism, which performs provenance queries dynamically over streams of provenance assertions without requiring the assertions to be stored persistently.

The dissertation makes four major contributions to the state of the art. The first is a provenance model for streams that allows the provenance of individual stream elements to be obtained. The second is a provenance query method that uses a reverse mapping method, stream ancestor functions, to obtain the provenance of a particular stream processing result. The third is a stream-specific provenance query mechanism that enables provenance queries to be computed on the fly without requiring provenance assertions to be stored persistently. The fourth is a characterization of the performance of our stream provenance solution. It is shown that the storage overhead for provenance collection can be reduced significantly by using our storage reduction technique, and that the marginal cost of storage consumption is constant as the number of input stream events grows. Provenance recording imposes a 4% overhead on system performance for the persistent provenance approach and a 7% overhead for the stream-specific query approach. In addition, our stream-specific query approach offers low-latency processing (0.3 ms per additional component) with reasonable memory consumption.
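The abstract does not define stream ancestor functions in detail, so the following is only a minimal sketch of the general idea of a reverse mapping, under stated assumptions: stream elements are identified by their position, each operation has a fixed dependency pattern, and every name here (`sliding_mean`, `sliding_mean_ancestors`, `double_ancestors`, `trace`) is hypothetical rather than taken from the thesis. For each operation, a reverse function maps an output position back to the input positions it depends on, and composing these functions from the last operation back to the source gives the fine-grained provenance of one result.

```python
from typing import Callable, List

# Hypothetical illustration only: names and operations are assumptions,
# not the API or the operations described in the thesis.


def sliding_mean(inputs: List[float], size: int) -> List[float]:
    """Forward operation: output k is the mean of inputs[k : k + size]."""
    return [sum(inputs[k:k + size]) / size
            for k in range(len(inputs) - size + 1)]


def sliding_mean_ancestors(k: int, size: int) -> List[int]:
    """Reverse mapping: input positions that output k of sliding_mean uses."""
    return list(range(k, k + size))


def double(inputs: List[float]) -> List[float]:
    """Forward operation: a one-to-one map."""
    return [2 * x for x in inputs]


def double_ancestors(k: int) -> List[int]:
    """Reverse mapping: output k of double depends only on input k."""
    return [k]


def trace(output_index: int,
          ancestor_fns: List[Callable[[int], List[int]]]) -> List[int]:
    """Compose reverse mappings from the last operation back to the source.

    ancestor_fns is listed in pipeline order (first operation first).
    """
    frontier = {output_index}
    for fn in reversed(ancestor_fns):
        frontier = {i for k in frontier for i in fn(k)}
    return sorted(frontier)


# Pipeline: source -> sliding_mean(size=3) -> double
source = [1.0, 2.0, 3.0, 4.0, 5.0]
result = double(sliding_mean(source, 3))
pipeline = [lambda k: sliding_mean_ancestors(k, 3), double_ancestors]
ancestors = trace(1, pipeline)  # source positions behind result[1]
```

In this sketch the dependencies are computed from positions on demand rather than recorded per element. Whether that matches the storage reduction technique reported in the thesis is not something the abstract confirms.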
Bibliographic record

  • Author

    Sansrimahachai Watsawee

  • Affiliation
  • Year 2012
  • Total pages
  • Original format PDF
  • Language English
  • CLC classification
