Source: The VLDB Journal

Morton filters: fast, compressed sparse cuckoo filters

Abstract

Approximate set membership data structures (ASMDSs) are ubiquitous in computing. They trade a tunable, often small, error rate ($\epsilon$) for large space savings. The canonical ASMDS is the Bloom filter, which supports lookups and insertions but not deletions in its simplest form. Cuckoo filters (CFs), a recently proposed class of ASMDSs, add deletion support and often use fewer bits per item for equal $\epsilon$. This work introduces the Morton filter (MF), a novel CF variant that introduces several key improvements to its progenitor. Like CFs, MFs support lookups, insertions, and deletions, and when using an optional batching interface raise their respective throughputs by up to $2.5\times$, $20.8\times$, and $1.3\times$.
MFs achieve these improvements by (1) introducing a compressed block format that permits storing a logically sparse filter compactly in memory, (2) leveraging succinct embedded metadata to prune unnecessary memory accesses, and (3) more heavily biasing insertions to use a single hash function. With these optimizations, lookups, insertions, and deletions often only require accessing a single hardware cache line from the filter. MFs and CFs are then extended to support self-resizing, a feature of quotient filters (another ASMDS that uses fingerprints). MFs self-resize up to $13.9\times$ faster than rank-and-select quotient filters (a state-of-the-art self-resizing filter). These improvements are not at a loss in space efficiency, as MFs typically use comparable to slightly less space than CFs for equal $\epsilon$.
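
To make the cuckoo-filter baseline concrete, the following is a minimal Python sketch of a plain cuckoo filter, not the authors' Morton filter and not their implementation. It stores short fingerprints in fixed-size buckets, gives each item two candidate buckets, supports deletion, and always tries the primary bucket first, which is only a simple stand-in for the insertion bias the abstract describes. The class name `TinyCuckooFilter`, all parameter values, and the choice of BLAKE2b as the hash are assumptions made for illustration; the compressed block format and embedded metadata are not modelled.

```python
import hashlib
import random


class TinyCuckooFilter:
    """Toy cuckoo filter (illustrative only; all names and sizes are assumptions)."""

    def __init__(self, num_buckets=64, slots_per_bucket=4, fp_bits=8,
                 max_kicks=500, seed=0):
        # A power-of-two bucket count keeps the XOR step below in range.
        assert num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.slots = slots_per_bucket
        self.fp_bits = fp_bits
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    def _hash(self, data: bytes) -> int:
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def _fingerprint(self, item: str) -> int:
        fp = self._hash(b"fp:" + item.encode()) % (1 << self.fp_bits)
        return fp or 1  # avoid 0 so every fingerprint is non-zero

    def _primary(self, item: str) -> int:
        return self._hash(b"ix:" + item.encode()) % self.num_buckets

    def _alternate(self, index: int, fp: int) -> int:
        # Partial-key cuckoo hashing: the other bucket depends only on
        # (index, fingerprint), so it can be found without the original item.
        return index ^ (self._hash(fp.to_bytes(4, "big")) % self.num_buckets)

    def insert(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._primary(item)
        i2 = self._alternate(i1, fp)
        # Try the primary bucket first, so most items are placed with one hash.
        for i in (i1, i2):
            if len(self.buckets[i]) < self.slots:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident and move it to its alternate.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            victim = self.rng.randrange(self.slots)
            fp, self.buckets[i][victim] = self.buckets[i][victim], fp
            i = self._alternate(i, fp)
            if len(self.buckets[i]) < self.slots:
                self.buckets[i].append(fp)
                return True
        # Too full. The last evicted fingerprint is dropped here; a real
        # filter would have to handle this case instead of losing an item.
        return False

    def contains(self, item: str) -> bool:
        fp = self._fingerprint(item)
        i1 = self._primary(item)
        if fp in self.buckets[i1]:
            return True  # answered from the primary bucket alone
        return fp in self.buckets[self._alternate(i1, fp)]

    def delete(self, item: str) -> bool:
        # Only safe for items that were actually inserted.
        fp = self._fingerprint(item)
        i1 = self._primary(item)
        for i in (i1, self._alternate(i1, fp)):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

A typical use would be to create `TinyCuckooFilter()`, call `insert` for each key, and then query with `contains`. Lookups can return false positives when two keys share a fingerprint and a bucket, which is the error rate $\epsilon$ that the abstract refers to; inserted keys are never reported missing as long as every `insert` call succeeded.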
