Source: IEEE Transactions on Knowledge and Data Engineering

Incremental Query Processing on Big Data Streams


Abstract

This paper addresses online query processing for large-scale, incremental data analysis on a distributed stream processing engine (DSPE). Our goal is to convert any SQL-like query to an incremental DSPE program automatically. In contrast to other approaches, we derive incremental programs that return accurate results, not approximate answers, by retaining a minimal state during the query evaluation lifetime and by using a novel incremental evaluation technique, which, at each time interval, returns an accurate snapshot answer that depends on the current state and the latest batches of data. Our methods can handle many forms of queries on nested data collections, including iterative and nested queries, group-by with aggregation, and equi-joins. Finally, we report on a prototype implementation of our framework, called MRQL Streaming, running on top of Spark and we experimentally validate the effectiveness of our methods.
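The core idea in the abstract, keeping a small state and combining it with each new batch to produce an exact snapshot answer, can be illustrated with a group-by average. The following is only a minimal sketch in plain Python, not the MRQL Streaming implementation and not Spark code; the class name `IncrementalAvg` and its `update` method are hypothetical names introduced for this illustration. It assumes the state for each key is a running (sum, count) pair, which is enough to recover the exact average without storing past records.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


class IncrementalAvg:
    """Hypothetical helper: exact per-key average over a stream of batches."""

    def __init__(self) -> None:
        # State retained across batches: key -> [running sum, running count].
        # Its size grows with the number of distinct keys, not with the
        # number of records seen so far.
        self.state: Dict[str, List[float]] = defaultdict(lambda: [0.0, 0])

    def update(self, batch: Iterable[Tuple[str, float]]) -> Dict[str, float]:
        """Merge the latest batch into the state and return a snapshot answer."""
        for key, value in batch:
            acc = self.state[key]
            acc[0] += value  # fold the new value into the running sum
            acc[1] += 1      # and bump the running count
        # The snapshot depends only on the current state, yet equals the
        # result of recomputing the average over all data seen so far.
        return {key: total / count for key, (total, count) in self.state.items()}


if __name__ == "__main__":
    query = IncrementalAvg()
    print(query.update([("a", 1.0), ("b", 4.0)]))
    print(query.update([("a", 3.0)]))
```

Storing (sum, count) rather than the average itself is the design choice that keeps the answer exact: averages cannot be merged directly, but sums and counts can.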
