DISTRIBUTED COMPUTATION OF PERCENTILE STATISTICS FOR MULTIDIMENSIONAL DATA SETS
Abstract
The disclosed embodiments provide a system for processing data. During operation, the system obtains a set of partitions containing a set of records, wherein the records include a set of values for a measure and a set of dimensions associated with the values. Next, the system reorganizes the records across the partitions by performing a distributed sort of the records by the measure. For each dimensional subset in the records, the system counts occurrences of the dimensional subset in each of the partitions and groups values of the counted occurrences by the dimensional subset so that the values reside in a single processing node. The system uses the values to identify one or more locations in the partitions for calculating a statistic for the dimensional subset and uses the location(s) to calculate the statistic. Finally, the system outputs the statistic in response to a query containing the dimensional subset.
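The flow described in the abstract can be illustrated with a minimal single-process sketch in Python. This is not the patented implementation: partitions are simulated as in-memory lists, each record is assumed to be a `(measure, dims)` pair where `dims` is a tuple of dimension values, the statistic is assumed to be a nearest-rank percentile, and all function names (`distributed_sort`, `count_per_partition`, `locate`, `percentile`) are hypothetical.

```python
import math
from collections import defaultdict


def distributed_sort(partitions, num_partitions):
    """Stand-in for a distributed sort by measure (assumption: simulated locally).

    Afterwards each partition is sorted, and every value in partition i
    is <= every value in partition i + 1.
    """
    records = sorted((r for p in partitions for r in p), key=lambda r: r[0])
    size = max(1, math.ceil(len(records) / num_partitions))
    return [records[i * size:(i + 1) * size] for i in range(num_partitions)]


def count_per_partition(sorted_parts):
    """Count occurrences of each dimensional subset in each partition.

    The result maps dims -> list of per-partition counts, which mimics
    grouping the counts for one subset onto a single processing node.
    """
    counts = defaultdict(lambda: [0] * len(sorted_parts))
    for idx, part in enumerate(sorted_parts):
        for _, dims in part:
            counts[dims][idx] += 1
    return dict(counts)


def locate(counts_for_key, q):
    """Return (partition index, offset among matching records) for quantile q."""
    total = sum(counts_for_key)
    rank = max(1, math.ceil(q * total))  # nearest-rank definition, 1-based
    seen = 0
    for idx, c in enumerate(counts_for_key):
        if seen + c >= rank:
            return idx, rank - seen - 1  # 0-based offset inside partition idx
        seen += c
    raise ValueError("no records for this dimensional subset")


def percentile(sorted_parts, counts, dims, q):
    """Read the statistic from the single partition identified by locate()."""
    idx, offset = locate(counts[dims], q)
    matching = [m for m, d in sorted_parts[idx] if d == dims]
    return matching[offset]
```

Because the records are globally ordered by measure, the per-partition counts are enough to pinpoint which partition holds the requested rank, so only that partition needs to be scanned to answer a query for a given dimensional subset.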