首页> 外文会议>Data Compression Conference (DCC), 2012 >A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time
【24h】

A Cuckoo Hashing Variant with Improved Memory Utilization and Insertion Time

机译:具有改善的内存利用率和插入时间的杜鹃哈希变体

获取原文
获取原文并翻译 | 示例

摘要

Cuckoo hashing [4] is a multiple choice hashing scheme in which each item can be placed in multiple locations, and collisions are resolved by moving items to their alternative locations. In the classical implementation of two-way cuckoo hashing, the memory is partitioned into contiguous disjoint xed-size buckets. Each item is hashed to two buckets, and may be stored in any of the positions within those buckets. Ref. [2] analyzed a variation in which the buckets are contiguous and overlap. However, many systems retrieve data from secondary storage in same-size blocks called pages. Fetching a page is a relatively expensive process, but once a page is fetched, its contents can be accessed orders of magnitude faster. We utilize this property of memory retrieval, presenting a variant of cuckoo hashing incorporating the following constraint: each bucket must be fully contained in a single page, but buckets are not necessarily contiguous. Empirical results show that this modification increases memory utilization and decreases the number of iterations required to insert an item. If each item is hashed to two buckets of capacity two, the page size is 8, and each bucket is fully contained in a single page, the memory utilization equals 89.71% in the classical contiguous disjoint bucket variant, 93.78% in the contiguous overlapping bucket variant, and increases to 97.46% in our new non-contiguous bucket variant. When the memory utilization is 92% and we use breadth rest search to look for a vacant position, the number of iterations required to insert a new item is dramatically reduced from 545 in the contiguous overlapping buckets variant to 52 in our new non-contiguous bucket variant. In addition to the empirical results, we present a theoretical lower bound on the memory utilization of our variation as a function of the page size.
机译:布谷鸟哈希[4]是一种多选哈希方案,其中每个项目都可以放置在多个位置,并且通过将项目移动到它们的替代位置来解决冲突。在双向杜鹃哈希的经典实现中,内存被划分为连续的不相交的固定大小的存储桶。每个项目都散列到两个存储桶中,并且可以存储在这些存储桶中的任何位置。参考[2]分析了一个变体,其中铲斗是连续的和重叠的。但是,许多系统从辅助存储中以相同大小的块(称为页面)检索数据。提取页面是一个相对昂贵的过程,但是一旦提取了页面,就可以更快地访问其内容。我们利用内存检索的此属性,提出了布谷鸟哈希的变体,它包含以下约束:每个存储桶必须完全包含在单个页面中,但是存储桶不一定是连续的。实验结果表明,此修改可以提高内存利用率,并减少插入项目所需的迭代次数。如果将每个项目散列到容量为2的两个存储桶,页面大小为8,并且每个存储桶完全包含在单个页面中,则经典连续不连续存储桶变体中的内存利用率为89.71%,连续重叠存储桶中的内存利用率为93.78%变体,在我们的新非连续式桶变体中增加到97.46%。当内存利用率为92%,并且我们使用广度余地搜索来查找空位时,插入新项所需的迭代次数从连续重叠存储桶变体中的545大大减少到我们新的非连续存储桶中的52变体。除了经验结果外,我们还提供了根据页大小而变化的内存利用率的理论下限。

著录项

  • 来源
    《Data Compression Conference (DCC), 2012》|2012年|p.347- 356|共10页
  • 会议地点 Snowbird UT(US)
  • 作者

    Porat E.;

  • 作者单位

    Dept. of Comput. Sci., Bar Ilan Univ., Ramat Gan, Israel;

  • 会议组织
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类 TP311.56;
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号