This paper discusses efficiency and effectiveness issues in caching the results of queries submitted to a Web Search Engine (WSE). We propose SDC, a new caching strategy designed to exploit efficiently the temporal and spatial locality present in the stream of processed queries. SDC stores the results of the most frequently submitted queries in a static, read-only portion of the cache, while the queries that cannot be satisfied by the static portion compete for the remaining cache entries according to a given replacement policy. Moreover, we improve the hit ratio of SDC with a speculative prefetching strategy, which anticipates future requests while introducing only a limited overhead on the back-end WSE. We experimentally demonstrate the superiority of SDC over purely static and purely dynamic policies by measuring the hit ratio achieved on three large query logs while varying the cache parameters and the replacement policy used to manage the dynamic part of the cache. Finally, we deployed a concurrent version of our caching system and measured its throughput. Our tests show that the SDC cache can be efficiently exploited by several threads that concurrently serve the queries of different users.
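The two-part organization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `SDCCache`, the `fetch` callback standing in for the back-end WSE, the use of a training log to choose the static entries, and the choice of LRU as the replacement policy are all assumptions made for illustration. Prefetching and concurrent access are omitted.

```python
from collections import Counter, OrderedDict


class SDCCache:
    """Illustrative static + dynamic result cache (hypothetical names throughout)."""

    def __init__(self, training_queries, static_size, dynamic_size, fetch):
        # Static part: results of the most frequent queries in a training log.
        # It is filled once and never modified afterwards (read-only).
        top = [q for q, _ in Counter(training_queries).most_common(static_size)]
        self.static = {q: fetch(q) for q in top}
        # Dynamic part: here managed with LRU, one possible replacement policy.
        self.dynamic = OrderedDict()
        self.dynamic_size = dynamic_size
        self.fetch = fetch  # assumed callback that queries the back-end engine
        self.hits = 0
        self.misses = 0

    def get(self, query):
        # 1. Try the static, read-only portion first.
        if query in self.static:
            self.hits += 1
            return self.static[query]
        # 2. Then try the dynamic portion and refresh the entry's recency.
        if query in self.dynamic:
            self.hits += 1
            self.dynamic.move_to_end(query)
            return self.dynamic[query]
        # 3. Miss: ask the back end, then compete for a dynamic entry.
        self.misses += 1
        result = self.fetch(query)
        if self.dynamic_size > 0:
            if len(self.dynamic) >= self.dynamic_size:
                self.dynamic.popitem(last=False)  # evict least recently used
            self.dynamic[query] = result
        return result

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Under these assumptions, replaying a query log through `get` and reading `hit_ratio()` gives the kind of measurement the abstract refers to. Varying `static_size` relative to `dynamic_size` changes how much of the cache is devoted to the static part.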