An ε-approximate quantile summary of a sequence of N elements is a data structure that can answer quantile queries about the sequence to within a precision of εN.
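To make the precision guarantee concrete, the following is a minimal sketch of a checker for a single answer. It assumes the usual reading of the definition: for a quantile φ, an answer is acceptable if one of its ranks in the sorted sequence lies within εN of ⌈φN⌉. The function name `is_eps_approximate` and its parameters are illustrative and do not come from the paper.

```python
import bisect
import math


def is_eps_approximate(sorted_data, phi, answer, eps):
    """Return True if `answer` is an eps-approximate phi-quantile.

    Assumption: ranks are 1-based, and a value that occurs several
    times may use any of its ranks.
    """
    n = len(sorted_data)
    target = math.ceil(phi * n)  # exact target rank
    lowest = bisect.bisect_left(sorted_data, answer) + 1  # smallest rank of answer
    highest = bisect.bisect_right(sorted_data, answer)    # largest rank of answer
    if highest < lowest:
        return False  # answer does not occur in the data
    # Accept if some rank of `answer` is within eps * n of the target rank.
    return lowest - eps * n <= target <= highest + eps * n


data = list(range(1, 101))  # N = 100, so the rank of x is x itself
print(is_eps_approximate(data, 0.5, 55, 0.1))
print(is_eps_approximate(data, 0.5, 65, 0.1))
```

With N = 100 and ε = 0.1, any element whose rank is within 10 of rank 50 is a valid answer for the median, so 55 is accepted and 65 is not.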
We present a new online algorithm for computing ε-approximate quantile summaries of very large data sequences. The algorithm has a worst-case space requirement of O((1/ε) log(εN)). This improves upon the previous best result of O((1/ε) log²(εN)). Moreover, in contrast to earlier deterministic algorithms, our algorithm does not require a priori knowledge of the length of the input sequence.
Finally, the actual space bounds obtained on experimental data are significantly better than the worst-case guarantees of our algorithm as well as the observed space requirements of earlier algorithms.
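The abstract does not describe the data structure itself, so the following is only a simplified sketch of one way an online summary of this kind can be maintained, not the algorithm as published. It assumes each stored value carries a pair (g, Δ), where the running sum of g gives a lower bound on the value's rank and Δ gives the slack to the upper bound, and it keeps g + Δ ≤ 2εn for every tuple once n ≥ 1/(2ε). The class name `QuantileSummary`, the merge schedule, and the greedy merge rule are assumptions made for illustration; this simplified merge rule is not claimed to achieve the O((1/ε) log(εN)) space bound.

```python
import bisect
import math


class QuantileSummary:
    """Simplified online eps-approximate quantile summary (illustrative)."""

    def __init__(self, eps):
        self.eps = eps
        self.n = 0
        self.values = []  # stored values, kept sorted
        self.g = []       # g[i]: rmin(i) - rmin(i-1)
        self.delta = []   # delta[i]: rmax(i) - rmin(i)
        self.period = max(1, int(1 / (2 * eps)))  # assumed compress schedule

    def insert(self, v):
        i = bisect.bisect_right(self.values, v)
        if i == 0 or i == len(self.values):
            d = 0  # a new minimum or maximum has an exactly known rank
        else:
            # Inherit the rank uncertainty of the successor tuple.
            d = self.g[i] + self.delta[i] - 1
        self.values.insert(i, v)
        self.g.insert(i, 1)
        self.delta.insert(i, d)
        self.n += 1
        if self.n % self.period == 0:
            self._compress()

    def _compress(self):
        limit = math.floor(2 * self.eps * self.n)
        i = len(self.values) - 2
        while i >= 1:  # index 0 (the minimum) is never merged away
            if self.g[i] + self.g[i + 1] + self.delta[i + 1] <= limit:
                self.g[i + 1] += self.g[i]  # fold tuple i into tuple i + 1
                del self.values[i], self.g[i], self.delta[i]
            i -= 1

    def query(self, phi):
        if self.n == 0:
            raise ValueError("summary is empty")
        r = max(1, math.ceil(phi * self.n))
        tol = self.eps * self.n
        rmin = 0
        for v, g, d in zip(self.values, self.g, self.delta):
            rmin += g
            if r - rmin <= tol and rmin + d - r <= tol:
                return v
        return self.values[-1]
```

Note that `insert` never needs the total length N of the stream in advance, which mirrors the property highlighted in the abstract. A typical use is to call `insert` for each arriving element and `query(0.5)` at any point for an approximate median.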