首页> 外文会议>Data Compression Conference >A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time
【24h】

A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time

机译:杜鹃散列变体,内存利用率提高和插入时间

获取原文

摘要

Cuckoo hashing [4] is a multiple choice hashing scheme in which each item can be placed in multiple locations, and collisions are resolved by moving items to their alternative locations. In the classical implementation of two-way cuckoo hashing, the memory is partitioned into contiguous disjoint xed-size buckets. Each item is hashed to two buckets, and may be stored in any of the positions within those buckets. Ref. [2] analyzed a variation in which the buckets are contiguous and overlap. However, many systems retrieve data from secondary storage in same-size blocks called pages. Fetching a page is a relatively expensive process, but once a page is fetched, its contents can be accessed orders of magnitude faster. We utilize this property of memory retrieval, presenting a variant of cuckoo hashing incorporating the following constraint: each bucket must be fully contained in a single page, but buckets are not necessarily contiguous. Empirical results show that this modification increases memory utilization and decreases the number of iterations required to insert an item. If each item is hashed to two buckets of capacity two, the page size is 8, and each bucket is fully contained in a single page, the memory utilization equals 89.71% in the classical contiguous disjoint bucket variant, 93.78% in the contiguous overlapping bucket variant, and increases to 97.46% in our new non-contiguous bucket variant. When the memory utilization is 92% and we use breadth rest search to look for a vacant position, the number of iterations required to insert a new item is dramatically reduced from 545 in the contiguous overlapping buckets variant to 52 in our new non-contiguous bucket variant. In addition to the empirical results, we present a theoretical lower bound on the memory utilization of our variation as a function of the page size.
机译:Cuckoo Hashing [4]是一种多项选择散列方案,其中每个项目可以放置在多个位置,并且通过将项目移动到其替代位置来解决冲突。在双向CUCKOO散列的经典实现中,内存被划分为连续的不相交XED大小桶。每个项目都是散列到两个桶,并且可以存储在这些桶内的任何位置。参考。 [2]分析了铲斗是连续的并且重叠的变化。但是,许多系统从称为页面的相同大小块中从辅助存储中检索数据。获取页面是一个相对昂贵的过程,但一旦提取了一个页面,就可以更快地访问其内容。我们利用内存检索的此属性,呈现Cuckoo Hashing的变体,该变量包含以下约束:每个桶必须完全包含在单个页面中,但桶不一定是连续的。经验结果表明,此修改会提高内存利用率,并降低插入项目所需的迭代次数。如果每个项目都有两个容量两个桶,则页面大小为8,每个桶都完全包含在单个页面中,内存利用率在经典连续的不相交桶体内等于89.71%,连续重叠桶中的93.78%变体,在我们新的非连续铲斗变体中增加到97.46%。当内存利用率为92%并且我们使用广度休息搜索寻找空缺位置时,在我们的新非连续桶中,插入新项目所需的迭代次数从545中显着减少到52中的52变体。除了经验结果之外,我们还为页面大小的函数呈现了我们的变化的内存利用的理论下限。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号