Ripple Joins for Online Aggregation

Abstract

We present a new family of join algorithms, called ripple joins, for online processing of multi-table aggregation queries in a relational database management system (DBMS). Such queries arise naturally in interactive exploratory decision-support applications. Traditional offline join algorithms are designed to minimize the time to completion of the query. In contrast, ripple joins are designed to minimize the time until an acceptably precise estimate of the query result is available, as measured by the length of a confidence interval. Ripple joins are adaptive, adjusting their behavior during processing in accordance with the statistical properties of the data. Ripple joins also permit the user to dynamically trade off the two key performance factors of online aggregation: the time between successive updates of the running aggregate, and the amount by which the confidence-interval length decreases at each update. We show how ripple joins can be implemented in an existing DBMS using iterators, and we give an overview of the methods used to compute confidence intervals and to adaptively optimize the ripple join "aspect-ratio" parameters. In experiments with an initial implementation of our algorithms in the POSTGRES DBMS, the time required to produce reasonably precise online estimates was up to two orders of magnitude smaller than the time required for the best offline join algorithms to produce exact answers.
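
The idea of a running aggregate that is refined as more tuples are joined can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-table SUM query, in-memory lists, and a fixed "square" schedule that draws one new tuple from each input per step, and it omits both the confidence-interval computation and the adaptive aspect-ratio tuning mentioned in the abstract. The names `ripple_join_sum`, `match`, and `value` are hypothetical.

```python
import random


def ripple_join_sum(r, s, match, value, seed=0):
    """Yield running estimates of SUM(value(x, y)) over all matching pairs.

    Assumption: both inputs fit in memory and are read in random order.
    Each step draws one new tuple per input (a fixed 1:1 schedule).
    """
    rng = random.Random(seed)
    r, s = list(r), list(s)
    rng.shuffle(r)  # random retrieval order, so the seen prefix is a sample
    rng.shuffle(s)

    seen_r, seen_s = [], []
    running = 0.0  # sum over matching pairs among the tuples seen so far

    for k in range(max(len(r), len(s))):
        if k < len(r):
            x = r[k]
            # Join the new R tuple with every S tuple seen so far.
            running += sum(value(x, y) for y in seen_s if match(x, y))
            seen_r.append(x)
        if k < len(s):
            y = s[k]
            # Join the new S tuple with every R tuple seen so far,
            # including the R tuple added in this step.
            running += sum(value(x, y) for x in seen_r if match(x, y))
            seen_s.append(y)
        if seen_r and seen_s:
            # Scale the partial sum up to the full cross product.
            scale = (len(r) * len(s)) / (len(seen_r) * len(seen_s))
            yield scale * running
```

Each yielded value is the estimate available after one more step; once both inputs are exhausted the scale factor is 1 and the last value is the exact sum.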
