首页> 外文学位 >Upper and Lower Bounds for Sorting and Searching in External Memory.
【24h】

Upper and Lower Bounds for Sorting and Searching in External Memory.

机译:用于在外部存储器中进行排序和搜索的上限和下限。

获取原文
获取原文并翻译 | 示例

摘要

This dissertation presents variants on sorting and searching in external memory.;In the first part of the dissertation, we derive lower and upper bounds on sorting with different-sized records. We show that the record size substantially affects the sorting complexity, and so does the final interleaving of the smaller and larger records in the final sorted sequence: sorting costs more when large and small records are segregated than when they are interleaved in the final sorted order.;In the second part of the dissertation, we study the batched predecessor problem in external memory. Given the underlying sorted set S of size n, and a sorted query Q of size x ≤ nc, 0 ≤ c < 1, we study tradeoffs between the searching cost, and the cost to preprocess S. We give lower bounds in three external memory models: the I/O comparison model, I/O pointer-machine model, and the indexability model. Our results show that in the I/O comparison model, the batched predecessor problem needs O(logBn + 1/B) I/Os per element if the preprocessing is polynomial; with exponential preprocessing, the problem can be solved faster, in theta((log2n + 1)/B).;In the third part of the dissertation, we introduce alternatives to the well-known Bloom filter data structure. The quotient filter is designed for RAM, but with better data locality than the Bloom filter. The buffered quotient filter and the cascade filter are SSD-optimized alternatives to the Bloom filter. In experiments, the cascade filter and buffered quotient filter performed insertions 8.6-11 times faster than the fastest Bloom filter variant and performed lookups 0.94-2.56 times faster.
机译:本文提出了在外部存储器中排序和搜索的变体。在论文的第一部分,我们推导了不同大小记录的排序的上下界。我们显示记录的大小会极大地影响排序的复杂性,最终排序序列中较小和较大记录的最终交错也会影响:分离大小记录的排序成本要高于按最终排序顺序交错的大小记录。;在论文的第二部分,我们研究了外部存储器中的批量前驱问题。给定大小为n的基础排序集S和大小为x≤nc,0≤c <1的排序查询Q,我们研究了搜索成本与预处理S的成本之间的折衷。我们在三个外部存储器中给出了下界模型:I / O比较模型,I / O指针机器模型和可索引性模型。我们的结果表明,在I / O比较模型中,如果预处理是多项式,则批处理的前任问题每个元素需要O(logBn + 1 / B)I / O;通过指数预处理,可以在theta((log2n +1)/ B)中更快地解决问题。在论文的第三部分,我们介绍了著名的Bloom过滤器数据结构的替代方法。商过滤器是为RAM设计的,但其数据局部性比Bloom过滤器更好。缓冲商过滤器和级联过滤器是Bloom过滤器的SSD优化替代方案。在实验中,级联过滤器和缓冲商过滤器的插入速度比最快的Bloom过滤器变体快8.6-11倍,而查找速度则快0.94-2.56倍。

著录项

  • 作者

    Medjedovic, Dzejla.;

  • 作者单位

    State University of New York at Stony Brook.;

  • 授予单位 State University of New York at Stony Brook.;
  • 学科 Computer science.
  • 学位 Ph.D.
  • 年度 2014
  • 页码 83 p.
  • 总页数 83
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号