Published in: IEEE International Conference on Data Engineering

An Iterative Scheme for Leverage-Based Approximate Aggregation

Abstract

The current data explosion makes it hard to perform approximate aggregation that is both efficient and accurate. To address this problem, we propose a novel approach that computes aggregation answers with high accuracy using only a small portion of the data. We introduce leverages to reflect individual differences in the data from a statistical perspective. Two estimators, a leverage-based estimator and a sketch estimator (a "rough picture" of the aggregation answer), constrain each other and are iteratively refined according to the actual conditions until their difference falls below a threshold. The iteration mechanism and the leverages together give our approach its high accuracy. Moreover, the approach does not need to record the sampled data and extends easily to various execution modes, such as the online mode, which makes it well suited to big data. Experiments show that our approach performs very well: compared with uniform sampling, it achieves high-quality answers with only 1/3 of the sample size.
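
The abstract does not say how the leverages are computed or how the two estimators constrain each other, so the following Python sketch is only a loose illustration of the general idea for an AVG aggregation. It is not the authors' algorithm. The function name `approximate_mean`, all of its parameters, the weighting formula, and the choice to report the midpoint of the two estimators are assumptions made for this example. It keeps a plain running mean as the "sketch" estimator, builds a second estimator from leverage-style weights, and draws more samples until the two agree within a threshold. Only running totals are kept between rounds, which mirrors the claim that sampled data need not be recorded.

```python
import random


def approximate_mean(data, batch_size=500, threshold=0.5, max_rounds=40, seed=0):
    """Illustrative only: estimate AVG(data) from uniform samples drawn in rounds.

    Assumes data is a non-empty sequence of numbers and max_rounds >= 1.
    """
    rng = random.Random(seed)
    n = 0            # number of samples seen so far
    total = 0.0      # running sum, used by the sketch estimator
    total_sq = 0.0   # running sum of squares, used for the spread
    w_sum = 0.0      # running sum of leverage-style weights
    wx_sum = 0.0     # running weighted sum of sampled values
    sketch = leverage_est = 0.0
    rounds = 0

    for rounds in range(1, max_rounds + 1):
        # The batch is only held for the current round and then discarded.
        batch = [rng.choice(data) for _ in range(batch_size)]

        # Sketch estimator: plain running mean over everything sampled so far.
        for x in batch:
            n += 1
            total += x
            total_sq += x * x
        sketch = total / n
        variance = max(total_sq / n - sketch * sketch, 1e-12)

        # Leverage-style estimator (ASSUMED weighting, not from the paper):
        # values far from the current sketch, relative to the spread,
        # receive a smaller weight.
        for x in batch:
            w = 1.0 / (1.0 + (x - sketch) ** 2 / variance)
            w_sum += w
            wx_sum += w * x
        leverage_est = wx_sum / w_sum

        # Stop once the two estimators agree within the threshold.
        if abs(leverage_est - sketch) < threshold:
            break

    return {
        "estimate": (sketch + leverage_est) / 2.0,  # arbitrary choice: midpoint
        "sketch": sketch,
        "leverage": leverage_est,
        "rounds": rounds,
        "samples": n,
    }
```

Calling `approximate_mean(values, threshold=0.5)` on a list of numbers returns both estimates, the number of rounds, and the number of samples drawn. The weighting used here is unbiased only for roughly symmetric data. On skewed data the two estimators may never agree and the loop simply stops at `max_rounds`, so a real implementation would need the paper's actual leverage definition and constraint relation.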
