首页> 外文OA文献 >Approximating Hit Rate Curves using Streaming Algorithms
【2h】

Approximating Hit Rate Curves using Streaming Algorithms

机译:使用流算法逼近命中率曲线

摘要

A hit rate curve is a function that maps cache size to the proportion of requests that can be served from the cache. (The caching policy and sequence of requests are assumed to be fixed.) Hit rate curves have been studied for decades in the operating system, database and computer architecture communities. They are useful tools for designing appropriate cache sizes, dynamically allocating memory between competing caches, and for summarizing locality properties of the request sequence. In this paper we focus on the widely-used LRU caching policy.Computing hit rate curves is very efficient from a runtime standpoint, but existing algorithms are not efficient in their space usage. For a stream of m requests for n cacheable objects, all existing algorithms that provably compute the hit rate curve use space linear in n. In the context of modern storage systems, n can easily be in the billions or trillions, so the space usage of these algorithms makes them impractical.We present the first algorithm for provably approximating hit rate curves for the LRU policy with sublinear space. Our algorithm uses O( p^2 * log(n) * log^2(m) / epsilon^2 ) bits of space and approximates the hit rate curve at p uniformly-spaced points to within additive error epsilon. This is not far from optimal. Any single-pass algorithm with the same guarantees must use Omega(p^2 + epsilon^{-2} + log(n)) bits of space. Furthermore, our use of additive error is necessary. Any single-pass algorithm achieving multiplicative error requires Omega(n) bits of space.
机译:命中率曲线是将高速缓存大小映射到可从高速缓存处理的请求比例的函数。 (假定缓存策略和请求顺序是固定的。)数十年来,在操作系统,数据库和计算机体系结构社区中研究了命中率曲线。它们是用于设计适当的缓存大小,在竞争的缓存之间动态分配内存以及汇总请求序列的局部性属性的有用工具。在本文中,我们着眼于广泛使用的LRU缓存策略。从运行时的角度来看,计算命中率曲线非常有效,但是现有算法在空间使用方面效率不高。对于n个可缓存对象的m个请求流,可证明计算命中率曲线的所有现有算法都使用n中的线性空间。在现代存储系统的情况下,n很容易达到数十亿或数万亿美元,因此这些算法的空间使用情况使它们不切实际。我们提出了第一种算法,可证明近似地估计具有亚线性空间的LRU策略的命中率曲线。我们的算法使用O(p ^ 2 * log(n)* log ^ 2(m)/ epsilon ^ 2)位空间,并将p个均匀间隔点的命中率曲线近似于加性误差epsilon之内。这离优化并不遥远。具有相同保证的任何单遍算法都必须使用Omega(p ^ 2 + epsilon ^ {-2} + log(n))位空间。此外,我们必须使用加法误差。任何实现乘法误差的单遍算法都需要Omega(n)位空间。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号