We study three comparison-based problems on multisets in the cache-oblivious model: duplicate elimination, multisorting, and finding the most frequent element (the mode). Our goal is to minimize the cache complexity (the number of cache misses) of algorithms for these problems when the cache size and block size are unknown. We give algorithms whose cache complexities are within a constant factor of optimal for all three problems. For determining the mode, the optimal algorithm is randomized, since the deterministic algorithm differs from the lower bound by a sublogarithmic factor. Optimality can be achieved either with a randomized method or when the algorithm is given, along with the input, lg lg of the relative frequency of the mode up to a constant additive error.
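To make the three problems concrete, the following is a minimal Python sketch of their input/output behaviour only. It assumes the elements are totally ordered and simply sorts in memory, so it says nothing about cache misses and is not the cache-oblivious method studied here. The function names are illustrative, and the exact output conventions (for example, which copy of a duplicate is kept) are assumptions.

```python
from itertools import groupby


def multisort(items):
    """Return the elements of the multiset in sorted order, keeping all copies."""
    return sorted(items)


def eliminate_duplicates(items):
    """Return one copy of each distinct element (here, in sorted order)."""
    # groupby yields one group per run of equal elements in the sorted list.
    return [key for key, _ in groupby(sorted(items))]


def mode(items):
    """Return (element, count) for a most frequent element.

    Ties are broken in favour of the smallest element (an arbitrary choice).
    Returns (None, 0) for an empty input.
    """
    best_key, best_count = None, 0
    for key, group in groupby(sorted(items)):
        count = sum(1 for _ in group)
        if count > best_count:
            best_key, best_count = key, count
    return best_key, best_count
```

For the multiset `[3, 1, 3, 2, 1, 3]`, `multisort` keeps all six elements, `eliminate_duplicates` keeps one copy of each of the three distinct values, and `mode` reports the value 3 with count 3, so its relative frequency is 3/6.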