Venue: IEEE International Conference on Data Engineering

Data stream partitioning re-optimization based on runtime dependency mining

Abstract

In distributed data stream processing, a program made of multiple queries can be parallelized by partitioning input streams according to the values of specific attributes, or partitioning keys. Applying different partitioning keys to different queries requires re-partitioning intermediary streams, causing extra communication and reduced throughput. Re-partitionings can be avoided by detecting dependencies between the partitioning keys applicable to each query. Existing partitioning optimization methods analyze query syntax at compile-time to detect inter-key dependencies and avoid re-partitionings. This paper extends those compile-time methods by adding a runtime re-optimization step based on the mining of temporal approximate dependencies (TADs) between partitioning keys. A TAD is defined in this paper as a type of dependency that can be approximately valid over a moving time window. Our evaluation, based on a simulation of the Linear Road Benchmark, showed a 94.5% reduction of the extra communication cost.
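To make the idea of a dependency that is "approximately valid over a moving time window" concrete, here is a minimal Python sketch. It is not the paper's mining algorithm: the class name `SlidingDependencyMonitor`, the error measure (the fraction of tuples in the window that would have to be dropped for key `a` to determine key `b` exactly), and the `max_error` threshold are all assumptions made for illustration.

```python
from collections import Counter, defaultdict, deque


class SlidingDependencyMonitor:
    """Hypothetical helper: tracks how well a -> b holds over a time window."""

    def __init__(self, window):
        self.window = window                 # window length, in timestamp units
        self.events = deque()                # (timestamp, a, b), oldest first
        self.counts = defaultdict(Counter)   # a -> Counter of b values seen

    def add(self, timestamp, a, b):
        """Record one tuple and drop tuples that fell out of the window."""
        self.events.append((timestamp, a, b))
        self.counts[a][b] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Remove every tuple whose timestamp is at or before now - window.
        while self.events and self.events[0][0] <= now - self.window:
            _, a, b = self.events.popleft()
            self.counts[a][b] -= 1
            if self.counts[a][b] == 0:
                del self.counts[a][b]
            if not self.counts[a]:
                del self.counts[a]

    def error(self):
        """Fraction of windowed tuples violating a -> b (assumed measure)."""
        total = len(self.events)
        if total == 0:
            return 0.0
        # For each a, keep only the tuples carrying its most frequent b.
        kept = sum(max(c.values()) for c in self.counts.values())
        return 1.0 - kept / total

    def holds(self, max_error):
        """True if the dependency is approximately valid in the window."""
        return self.error() <= max_error
```

Under these assumptions, a runtime optimizer could poll `holds(max_error)` for a pair of partitioning keys and skip the re-partitioning between two queries while the dependency remains approximately valid, falling back to re-partitioning when it stops holding.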
