Single pass space efficient system and method for generating an approximate quantile in a data set having an unknown size
Abstract
A space-efficient system and method for generating an approximate φ-quantile data element of a data set in a single pass over the data set, without a priori knowledge of the size of the data set. The approximate φ-quantile is guaranteed to lie within a user-specified approximation error ε of the true quantile being sought with a probability of at least 1−δ, where δ is a user-defined probability of failure. Initially, b buffers, each having a capacity of k elements, are filled with elements from the data set, with the values of b and k depending on the approximation error ε and the failure probability δ. The buffers are then collapsed into an output buffer; the remaining buffers are refilled with elements and collapsed (along with the previous output buffer), and so on until the entire data set has been processed and a single output buffer remains. The element of that output buffer corresponding to the sought quantile is then output as the approximate quantile. In later iterations (when the height of the tree of collapse operations is at least equal to a predetermined height that depends on δ and ε), the data is sampled non-uniformly to populate the buffers and achieve the desired performance. Parallel processors can be used, with the final output buffers of the processors being sent to a collecting processor P0 as input buffers to the collecting processor P0.
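The fill-and-collapse procedure described in the abstract can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the patented method: the names `collapse` and `approx_quantile` are hypothetical, b and k are chosen by hand rather than derived from ε and δ, a single fixed collapse offset is used, and the non-uniform sampling and parallel-processor steps are omitted, so no (ε, δ) guarantee is claimed for it.

```python
import heapq
import math


def collapse(buffers, k):
    """Merge weighted sorted buffers into one buffer of k elements.

    Each buffer is (weight, sorted_list). Conceptually every element is
    repeated `weight` times, the copies are merged in sorted order, and
    every W-th copy is kept, where W is the sum of the input weights.
    """
    total_w = sum(w for w, _ in buffers)
    merged = heapq.merge(*[[(x, w) for x in data] for w, data in buffers])
    target = (total_w + 1) // 2  # 1-indexed position of the first kept copy
    seen = 0                     # weighted copies consumed so far
    out = []
    for x, w in merged:
        seen += w
        while len(out) < k and target <= seen:
            out.append(x)
            target += total_w
    return (total_w, out)


def approx_quantile(stream, phi, b=5, k=100):
    """Single-pass approximate phi-quantile; b and k are illustrative."""
    full = []     # full buffers as (weight, sorted_list)
    current = []  # buffer currently being filled
    for x in stream:
        current.append(x)
        if len(current) == k:
            full.append((1, sorted(current)))
            current = []
            if len(full) == b:
                # Collapse everything, including any earlier output buffer.
                full = [collapse(full, k)]
    if current:
        full.append((1, sorted(current)))  # partially filled last buffer
    if not full:
        raise ValueError("empty stream")
    # Weighted quantile over whatever buffers remain at the end.
    items = sorted((x, w) for w, data in full for x in data)
    total = sum(w for _, w in items)
    target = max(1, math.ceil(phi * total))
    acc = 0
    for x, w in items:
        acc += w
        if acc >= target:
            return x
    return items[-1][0]
```

For example, `approx_quantile(values, 0.5)` returns an estimate of the median of `values` while holding roughly b·k elements at a time, regardless of how long the input is. When the input has fewer than k elements no collapse occurs and the answer is exact.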