Venue: IEEE International Conference on Data Engineering

An Iterative Scheme for Leverage-Based Approximate Aggregation

Abstract

The current data explosion makes it challenging to perform approximate aggregation both efficiently and accurately. To address this problem, we propose a novel approach that computes aggregation answers with high accuracy using only a small portion of the data. We introduce leverages to reflect individual differences in the data from a statistical perspective. Two kinds of estimators, the leverage-based estimator and the sketch estimator (a "rough picture" of the aggregation answer), constrain each other and are iteratively improved according to the actual conditions until their difference falls below a threshold. Owing to the iteration mechanism and the leverages, our approach achieves high accuracy. Moreover, it does not need to record the sampled data and extends easily to various execution modes, such as the online mode, which makes it well suited to big data. Experiments show that our approach performs very well: compared with uniform sampling, it achieves high-quality answers with only 1/3 of the sample size.
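The abstract describes the control flow (two estimates that are refined until they agree within a threshold) but gives no formulas. As a reading aid only, the loop might be sketched as below. Everything in it is an assumption: the function name `iterative_estimate`, the weighting rule standing in for the leverages, and the damped update are hypothetical and are not the paper's estimators. The stand-in is a damped, iteratively reweighted mean, chosen because it is short and shows the stopping rule; it is not claimed to reproduce the paper's accuracy.

```python
def iterative_estimate(sample, threshold=1e-6, max_iter=100):
    """Toy loop: refine two estimates of a mean until they agree.

    Assumption: the weights below are a hypothetical stand-in for the
    paper's leverages, which the abstract does not define.
    Returns (estimate, number_of_iterations_used).
    """
    n = len(sample)
    # "Sketch" estimate: a rough picture, here the plain sample mean.
    sketch = sum(sample) / n
    # Fixed scale for the weights; fall back to 1.0 if all values are equal.
    scale = (sum((x - sketch) ** 2 for x in sample) / n) ** 0.5 or 1.0

    weighted = sketch
    for iteration in range(1, max_iter + 1):
        # Hypothetical per-value weight: values far from the current
        # sketch (relative to the scale) get a smaller weight.
        weights = [1.0 / (1.0 + ((x - sketch) / scale) ** 2) for x in sample]
        weighted = sum(w * x for w, x in zip(weights, sample)) / sum(weights)

        # Stop once the two estimates differ by less than the threshold.
        if abs(weighted - sketch) < threshold:
            return weighted, iteration

        # Damped update: move the sketch halfway toward the weighted estimate.
        sketch = (sketch + weighted) / 2.0

    return weighted, max_iter


if __name__ == "__main__":
    estimate, iterations = iterative_estimate([1.0, 2.0, 3.0, 4.0, 5.0])
    print(estimate, iterations)
```

Note that the loop keeps only running quantities and the current sample; whether and how the paper avoids recording the sampled data is not stated in the abstract and is not modeled here.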
