Proceedings of the Twenty-third International Conference on Very Large Data Bases

A One-Pass Algorithm for Accurately Estimating Quantiles for Disk-Resident Data

Abstract

The ψ-quantile of an ordered sequence of data values is the element with rank ψn, where n is the total number of values. Accurate estimates of quantiles are required for the solution of many practical problems. In this paper, we present a new algorithm for estimating the quantile values for disk-resident data. Our algorithm has the following characteristics: (1) It requires only one pass over the data; (2) It is deterministic; (3) It produces good lower and upper bounds of the true values of the quantiles; (4) It requires no a priori knowledge of the distribution of the data set; (5) It has a scalable parallel formulation; (6) Extra time and memory for computing additional quantiles (beyond the first one) are constant per quantile.

We present experimental results on the IBM SP-2. The experimental results show that the algorithm is indeed robust and does not depend on the distribution of the data sets.
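
The abstract does not spell out the algorithm's internals, so the following is only an illustrative sketch of one way a single pass can yield deterministic lower and upper bounds on a quantile. It reads the data in fixed-size runs, keeps regularly spaced samples from each sorted run, and picks two samples whose ranks provably bracket the target rank. The names `target_rank`, `quantile_bounds`, `run_size`, and `samples_per_run` are hypothetical, and the even-divisibility requirement is a simplification made for this sketch, not something stated in the source.

```python
import heapq
import math
from itertools import islice


def target_rank(psi, n):
    """1-indexed rank of the psi-quantile among n values (rank psi*n, rounded up)."""
    return max(1, math.ceil(psi * n))


def quantile_bounds(stream, n, run_size, samples_per_run, psi):
    """Return (lower, upper) with lower <= true psi-quantile <= upper.

    Sketch assumptions (not from the source): n is a multiple of run_size,
    run_size is a multiple of samples_per_run, and one run fits in memory.
    """
    if n % run_size or run_size % samples_per_run:
        raise ValueError("this sketch requires evenly divisible sizes")
    r = n // run_size                 # number of runs
    c = run_size // samples_per_run   # elements represented by each sample
    it = iter(stream)
    sample_lists = []
    global_min = None
    for _ in range(r):
        run = sorted(islice(it, run_size))   # each value is read exactly once
        if len(run) != run_size:
            raise ValueError("stream ended before n values were read")
        if global_min is None or run[0] < global_min:
            global_min = run[0]
        # Keep the elements at in-run ranks c, 2c, ..., run_size.
        sample_lists.append(run[c - 1::c])
    samples = list(heapq.merge(*sample_lists))  # r * samples_per_run values

    t = target_rank(psi, n)
    # At least k*c values are <= the k-th sample, so k = ceil(t / c)
    # gives a sample that is >= the value of rank t.
    k_up = -(-t // c)
    upper = samples[k_up - 1]
    # At most (k-1)*c + r*(c-1) values are < the k-th sample, so the
    # largest k keeping that count <= t-1 gives a sample <= the value of rank t.
    k_low = (t - (r - 1) * (c - 1)) // c
    lower = samples[k_low - 1] if k_low >= 1 else global_min
    return lower, upper


if __name__ == "__main__":
    data = [(i * 7919) % 1000 for i in range(1200)]
    lo, hi = quantile_bounds(iter(data), len(data), 100, 10, 0.5)
    exact = sorted(data)[target_rank(0.5, len(data)) - 1]
    print(lo, exact, hi)
```

In this sketch only the merged sample list is retained, and further quantiles can be answered from it by index lookup without rereading the data. The rank gap between each bound and the true quantile is on the order of n divided by `samples_per_run`, so keeping more samples per run tightens the bracket at the cost of memory.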