
DISTRIBUTED COMPUTATION OF PERCENTILE STATISTICS FOR MULTIDIMENSIONAL DATA SETS


Abstract

The disclosed embodiments provide a system for processing data. During operation, the system obtains a set of partitions containing a set of records, wherein the records include a set of values for a measure and a set of dimensions associated with the values. Next, the system reorganizes the records across the partitions by performing a distributed sort of the records by the measure. For each dimensional subset in the records, the system counts occurrences of the dimensional subset in each of the partitions and groups values of the counted occurrences by the dimensional subset so that the values reside in a single processing node. The system uses the values to identify one or more locations in the partitions for calculating a statistic for the dimensional subset and uses the location(s) to calculate the statistic. Finally, the system outputs the statistic in response to a query containing the dimensional subset.
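The steps in the abstract can be illustrated with a small single-process sketch. It is not the patented implementation. Several details are assumptions made only for illustration: records are `(dims, value)` tuples, a "dimensional subset" is treated as one exact combination of dimension values, the statistic is a nearest-rank percentile, and an in-memory sort split into chunks stands in for the distributed sort. All function names are hypothetical.

```python
from collections import defaultdict
from math import ceil


def distributed_sort(partitions, num_partitions):
    """Stand-in for a distributed sort by measure (assumption: done in memory).

    Afterwards each partition is sorted, and every value in partition i
    is <= every value in partition i + 1.
    """
    records = sorted((r for p in partitions for r in p), key=lambda r: r[1])
    size = max(1, ceil(len(records) / num_partitions))
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_by_dims(partitions):
    """Count occurrences of each dimension key in each partition.

    The result maps key -> list of per-partition counts, which models
    grouping the counts for one key onto a single processing node.
    """
    counts = defaultdict(lambda: [0] * len(partitions))
    for idx, part in enumerate(partitions):
        for dims, _value in part:
            counts[dims][idx] += 1
    return counts


def locate(counts_per_partition, q):
    """Return (partition index, offset among that key's rows) for quantile q.

    Uses the nearest-rank definition, which is an assumption of this sketch.
    """
    total = sum(counts_per_partition)
    rank = max(1, ceil(q * total))  # 1-based rank within the key's rows
    for idx, count in enumerate(counts_per_partition):
        if rank <= count:
            return idx, rank - 1
        rank -= count
    raise ValueError("no records for this dimension key")


def percentile(partitions, dims, q, num_partitions=3):
    """Compute the q-quantile of the measure for one dimension key."""
    sorted_parts = distributed_sort(partitions, num_partitions)
    counts = count_by_dims(sorted_parts)
    idx, offset = locate(counts[dims], q)
    # Only the one partition named by the location has to be scanned.
    matching = [v for d, v in sorted_parts[idx] if d == dims]
    return matching[offset]
```

Because the records are globally ordered by the measure, the per-partition counts for a key are enough to tell which partition holds the requested rank, so only that partition is scanned to read the value.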
