International Conference on Extending Database Technology

Estimating aggregates in time-constrained approximate queries in Oracle

Abstract

The concept of time-constrained SQL queries was introduced to address the problem of long-running SQL queries. A key approach to supporting time-constrained SQL queries is to use sampling to reduce the amount of data that needs to be processed, thereby allowing the query to complete within the specified time constraint. However, sampling makes the query results approximate and hence requires the system to estimate the values of the expressions (especially aggregates) occurring in the select list. Producing estimates for aggregates is therefore crucial for time-constrained approximate SQL queries to be useful, and it is the focus of this paper. Specifically, we address the problem of estimating commonly occurring aggregates (namely, SUM, COUNT, AVG, MEDIAN, MIN, and MAX) in time-constrained approximate queries. We give both point and interval estimates for SUM, COUNT, AVG, and MEDIAN using Bernoulli sampling for various types of queries, including join processing with cross product sampling. For MIN (MAX), we give the confidence level that 100γ% of the population will exceed the MIN (or be less than the MAX) obtained from the sampled data.
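To make the idea of point and interval estimates under Bernoulli sampling concrete, the following is a minimal Python sketch. It is not taken from the paper: it uses the textbook Horvitz-Thompson estimator with a normal-approximation interval for SUM and COUNT, and the standard distribution-free bound 1 - γ^n for MIN, which assumes the n sampled values behave like independent draws from a continuous distribution. All function names and parameters (`bernoulli_sample`, `estimate_sum`, `estimate_count`, `min_exceedance_confidence`, `p`, `z`) are hypothetical and chosen for illustration only.

```python
import math
import random


def bernoulli_sample(rows, p, rng):
    """Keep each row independently with probability p (Bernoulli sampling)."""
    return [r for r in rows if rng.random() < p]


def estimate_sum(sample, p, z=1.96):
    """Point and interval estimate of the population SUM.

    Point estimate: scale the sample sum by 1/p (Horvitz-Thompson).
    Variance estimate: (1 - p) / p^2 * sum(y^2), unbiased under Bernoulli sampling.
    The interval uses a normal approximation; z=1.96 gives roughly 95% coverage.
    """
    point = sum(sample) / p
    variance = (1.0 - p) / (p * p) * sum(y * y for y in sample)
    half_width = z * math.sqrt(variance)
    return point, (point - half_width, point + half_width)


def estimate_count(sample_size, p, z=1.96):
    """COUNT is the SUM of a column of ones, so reuse estimate_sum."""
    return estimate_sum([1] * sample_size, p, z)


def min_exceedance_confidence(sample_size, gamma):
    """Confidence that at least a fraction gamma of the population exceeds
    the sample MIN (by symmetry, is below the sample MAX).

    Assumption: sampled values act like i.i.d. draws from a continuous
    distribution, which gives the distribution-free bound 1 - gamma^n.
    """
    return 1.0 - gamma ** sample_size


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the demo is repeatable
    population = [rng.uniform(0, 100) for _ in range(100_000)]
    p = 0.01
    sample = bernoulli_sample(population, p, rng)

    total, total_ci = estimate_sum(sample, p)
    count, count_ci = estimate_count(len(sample), p)
    print("estimated SUM:", total, "interval:", total_ci)
    print("estimated COUNT:", count, "interval:", count_ci)
    # The AVG point estimate is the ratio of the two, i.e. the sample mean.
    print("estimated AVG:", total / count)
    print("confidence that 99% of rows exceed the sample MIN:",
          min_exceedance_confidence(len(sample), 0.99))
```

The sampling rate `p` is the knob a time-constrained query would tune: a smaller `p` processes fewer rows but widens the interval, since the variance term grows roughly as 1/p.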
