Approximate distinct counting in a bounded memory

Abstract

A table is processed to determine an approximate number of distinct values (NDV) for each of a plurality of groups. For each row, a group is identified based on one or more group-by columns. A hash value is generated by applying a uniform hash function to the value in an NDV column. The hash value is assigned to a particular bucket based on the values at a first set of bit positions in its binary representation. A bit position value is determined from the remaining portion of the binary representation; it is based on the number of ordered bits in the hash value that match a particular bit pattern. For each group identified, a maximum bit position (MBP) table is generated. The MBP table stores, for one or more buckets, the maximum bit position value determined for the hash values assigned to that bucket.
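The steps in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the 64-bit hash built from `hashlib.blake2b`, the choice of 4 bucket bits (16 buckets), the bit pattern (a leading run of zero bits, with the position counted from 1), and the helper names `hash64`, `bucket_and_position`, `build_mbp_tables`, and `estimate_ndv`. The abstract does not say how the final estimate is derived from an MBP table, so `estimate_ndv` substitutes the standard HyperLogLog estimator with its small-range correction.

```python
import hashlib
import math
from collections import defaultdict

BUCKET_BITS = 4                      # assumption: 2**4 = 16 buckets per group
HASH_BITS = 64                       # assumption: 64-bit hash values
REST_BITS = HASH_BITS - BUCKET_BITS  # bits left over after the bucket bits


def hash64(value):
    """Map a value to a roughly uniform 64-bit integer (illustrative choice)."""
    digest = hashlib.blake2b(repr(value).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def bucket_and_position(h):
    """Split a hash into (bucket index, bit position value).

    The bucket comes from the first BUCKET_BITS bits. The bit position is
    the number of leading zero bits in the remaining portion, plus one.
    """
    bucket = h >> REST_BITS
    rest = h & ((1 << REST_BITS) - 1)
    position = REST_BITS - rest.bit_length() + 1
    return bucket, position


def build_mbp_tables(rows, group_cols, ndv_col):
    """Build one maximum-bit-position (MBP) table per group."""
    tables = defaultdict(lambda: [0] * (1 << BUCKET_BITS))
    for row in rows:
        group = tuple(row[c] for c in group_cols)
        bucket, position = bucket_and_position(hash64(row[ndv_col]))
        mbp = tables[group]
        if position > mbp[bucket]:
            mbp[bucket] = position   # keep only the maximum per bucket
    return dict(tables)


def estimate_ndv(mbp):
    """HyperLogLog-style estimate from one MBP table (swapped-in estimator)."""
    m = len(mbp)
    alpha = 0.673                    # HyperLogLog constant, valid for m = 16 only
    raw = alpha * m * m / sum(2.0 ** -p for p in mbp)
    zeros = mbp.count(0)
    if raw <= 2.5 * m and zeros > 0:
        return m * math.log(m / zeros)   # small-range (linear counting) correction
    return raw


rows = [
    {"region": "east", "customer": "alice"},
    {"region": "east", "customer": "bob"},
    {"region": "east", "customer": "alice"},
    {"region": "west", "customer": "carol"},
]
tables = build_mbp_tables(rows, ["region"], "customer")
for group, mbp in sorted(tables.items()):
    print(group, round(estimate_ndv(mbp), 2))
```

Each group needs only a fixed-size table (16 small integers here), regardless of how many rows or distinct values it contains, which is what bounds the memory. With so few buckets the estimates are coarse; a practical configuration would likely use many more buckets, with the constant `alpha` adjusted to match.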
