Method and system for aggregating data in a large data set over a time period using presence bitmaps
Abstract
A system, method, and apparatus are provided for supporting and/or executing count-distinct queries. A large set of data (e.g., tens or hundreds of millions of event records) is condensed daily to generate presence bitmaps that reflect the distinctiveness of a selected data dimension S (e.g., user ID) for one or more key (grouping) dimensions g1, g2, ... (e.g., advertisement ID, campaign ID, advertiser ID). The condensation process eliminates duplication and yields a single value (e.g., 1 or 0) for each tuple [S, g1, ...], representing the distinctiveness of each value in the S dimension with respect to each combination of values in the grouping dimensions. On a monthly basis, the daily values are condensed to yield a single value for the month, and a similar process is applied for any other desired time granularity (e.g., year). The condensed data may be generated for any combination of selected dimension(s) and grouping dimension(s).
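The condensation described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a single grouping dimension (advertisement ID), uses set membership to stand in for a presence value of 1, and the function names `daily_presence`, `monthly_presence`, and `count_distinct` and the sample records are hypothetical.

```python
from collections import defaultdict

def daily_presence(events):
    """Condense one day's raw events into distinct (S, g1) tuples.

    Membership in the returned set plays the role of presence value 1;
    absence plays the role of 0. Duplicate events collapse to one entry.
    """
    return {(user_id, ad_id) for user_id, ad_id in events}

def monthly_presence(daily_sets):
    """Condense daily presence into one value per tuple for the month.

    A tuple is present for the month if it was present on any day
    (a logical OR across the daily values).
    """
    month = set()
    for day in daily_sets:
        month |= day
    return month

def count_distinct(presence):
    """Answer a count-distinct query: distinct S values per grouping key."""
    counts = defaultdict(int)
    for _user_id, ad_id in presence:
        counts[ad_id] += 1
    return dict(counts)

# Hypothetical event records: (user ID, advertisement ID).
day1 = [("u1", "adA"), ("u1", "adA"), ("u2", "adA"), ("u1", "adB")]
day2 = [("u1", "adA"), ("u3", "adA")]

month = monthly_presence([daily_presence(day1), daily_presence(day2)])
result = count_distinct(month)
print(result)
```

Because each time granularity is built from the already condensed values of the finer one, a yearly result could be produced the same way by passing monthly sets to `monthly_presence`.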