SYSTEMS AND METHODS FOR QUANTILE DETERMINATION IN A DISTRIBUTED DATA SYSTEM USING SAMPLING

Abstract

In accordance with the teachings described herein, systems and methods are provided for estimating or determining quantiles for data stored in a distributed system. In one embodiment, an instruction is received to estimate or determine a specified quantile for a variate in a set of data stored at a plurality of nodes in the distributed system. A plurality of data bins for the variate are defined, each associated with a different range of data values in the set of data. Lower and upper quantile bounds for each of the plurality of data bins are determined based on the total number of data values that fall within each of the plurality of data bins. The data bin that includes the specified quantile is identified based on the lower and upper quantile bounds, and the specified quantile is estimated or determined based on that identified bin.
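The binning procedure in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and it rests on several assumptions that do not appear in the source: each node is modeled as an in-memory list, the bins are equal-width between the global minimum and maximum, the quantile q is taken to be the order statistic of rank ceil(q * N), and the final value is found by sorting only the values in the identified bin. The names `bin_index`, `node_bin_counts`, and `find_quantile` are hypothetical.

```python
import math
from bisect import bisect_right


def bin_index(value, edges):
    """Bin i covers [edges[i], edges[i+1]); the global maximum is placed in the last bin."""
    return min(bisect_right(edges, value) - 1, len(edges) - 2)


def node_bin_counts(values, edges):
    """Work done locally on one node: count how many values fall in each bin."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        counts[bin_index(v, edges)] += 1
    return counts


def find_quantile(nodes, q, num_bins=5):
    """Return (value, bin, lower_bound, upper_bound) for quantile q.

    Assumption (not from the source): the quantile is the order statistic
    of rank ceil(q * N), where N is the total number of values.
    """
    n = sum(len(vals) for vals in nodes)
    lo = min(min(vals) for vals in nodes if vals)
    hi = max(max(vals) for vals in nodes if vals)
    if lo == hi:  # every value is identical, so no binning is needed
        return lo, 0, 0.0, 1.0
    rank = max(1, math.ceil(q * n))

    # Define equal-width bins over the observed range (an assumed binning rule).
    width = (hi - lo) / num_bins
    edges = [lo + i * width for i in range(num_bins)] + [hi]

    # Each node reports per-bin counts; only the counts are combined centrally.
    totals = [sum(c) for c in zip(*(node_bin_counts(vals, edges) for vals in nodes))]

    # Cumulative counts give each bin's lower and upper quantile bounds.
    below = 0
    for i, count in enumerate(totals):
        lower_bound, upper_bound = below / n, (below + count) / n
        if count and below < rank <= below + count:
            break  # bin i contains the requested quantile
        below += count

    # Resolve the exact value using only the values inside the identified bin.
    in_bin = sorted(v for vals in nodes for v in vals if bin_index(v, edges) == i)
    return in_bin[rank - below - 1], i, lower_bound, upper_bound


nodes = [[1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15, 4, 8, 12, 16]]
print(find_quantile(nodes, 0.5))  # (8, 2, 0.375, 0.5625)
```

In this sketch only per-bin counts cross node boundaries until the containing bin is known, after which only that bin's values are examined. When the identified bin is still large, the same steps could be repeated on that bin's narrower range instead of sorting its contents.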
