...
首页> 外文期刊>SIAM Journal on Computing >The limits of buffering: A tight lower bound for dynamic membership in the external memory model
【24h】

The limits of buffering: A tight lower bound for dynamic membership in the external memory model

机译:缓冲的局限性:外部存储器模型中动态成员资格的严格下限

获取原文
获取原文并翻译 | 示例
           

摘要

We study the dynamic membership (or dynamic dictionary) problem, which is one of the most fundamental problems in data structures. We study the problem in the external memory model with cell size b bits and cache size m bits. We prove that if the amortized cost of updates is at most 0.9 (or any other constant < 1), then the query cost must be Ω(log_(blog n)(n/m)), where n is the number of elements in the dictionary. In contrast, when the update time is allowed to be 1 + o(1), then a bit vector or hash table gives query time O(1). Thus, this is a threshold phenomenon for data structures. This lower bound answers a folklore conjecture of the external memory community. Since almost any data structure task can solve membership, our lower bound implies a dichotomy between two alternatives: (i) make the amortized update time at least 1 (so the data structure does not buffer, and we lose one of the main potential advantages of the cache), or (ii) make the query time at least roughly logarithmic in n. Our result holds even when the updates and queries are chosen uniformly at random and there are no deletions; it holds for randomized data structures, holds when the universe size is O(n), and does not make any restrictive assumptions such as indivisibility. All of the lower bounds we prove hold regardless of the space consumption of the data structure, while the upper bounds need only linear space. The lower bound has some striking implications for external memory data structures. It shows that the query complexities of many problems, such as one-dimensional range counting, predecessor, rank-select, and many others, are all the same in the regime where the amortized update time is less than 1, as long as the cell size is large enough (b = polylog(n) suffices). The proof of our lower bound is based on a new combinatorial lemma called the lemma of surprising intersections (LOSI), which allows us to use a proof methodology where we first analyze the intersection structure of the positive queries by using encoding arguments, and then use statistical arguments to deduce properties of the intersection structure of all queries, even the negative ones. In most other data structure arguments that we know, it is difficult to argue anything about the negative queries. Therefore we believe that the LOSI and this proof methodology might find future uses for other problems.
机译:我们研究动态隶属度(或动态字典)问题,它是数据结构中最基本的问题之一。我们在单元大小为b位且缓存大小为m位的外部存储器模型中研究该问题。我们证明如果更新的摊销成本最多为0.9(或任何其他常量<1),则查询成本必须为Ω(log_(blog n)(n / m)),其中n是其中的元素数词典。相反,当允许更新时间为1 + o(1)时,位向量或哈希表将给出查询时间O(1)。因此,这是数据结构的阈值现象。这个下限回答了外部记忆社区的民间传说猜想。由于几乎所有数据结构任务都可以解决成员资格问题,因此我们的下限意味着两个选择之间的二分法:(i)使摊销的更新时间至少为1(因此数据结构不会缓冲,并且我们失去了以下的主要潜在优势之一)缓存),或(ii)使查询时间至少以n为对数。即使随机选择了统一的更新和查询,也没有删除,我们的结果仍然成立。它适用于随机数据结构,当Universe大小为O(n)时适用,并且不做任何不可分割的限制假设。我们证明所有下限都成立,而与数据结构的空间消耗无关,而上限仅需要线性空间。下限对外部存储器数据结构具有惊人的含义。它表明,只要单元格的摊销更新时间少于1,许多问题的查询复杂性(例如一维范围计数,前任,等级选择等)在相同的情况下都是相同的大小足够大(b = polylog(n)足够)。下界的证明基于一个新的组合引理,即令人惊讶的交集引理(LOSI),它使我们能够使用证明方法,在该方法中,我们首先通过使用编码参数来分析正查询的交集结构,然后使用统计参数推论所有查询甚至是否定查询的交集结构的属性。在我们知道的大多数其他数据结构参数中,很难对否定查询进行任何争论。因此,我们认为LOSI和这种证明方法可能会发现其他问题的未来用途。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号