
Online Result Cache Invalidation for Real-time Web Search


Abstract

Caches of results are critical components of modern Web search engines, since they enable lower response times for frequent queries and reduce the load on the search engine back-end. Results in long-lived cache entries may become stale, however, as search engines continuously update their index to incorporate changes to the Web. Consequently, it is important to provide mechanisms that control the degree of staleness of cached results, ideally enabling the search engine to always return fresh results. In this paper, we present a new mechanism that identifies and invalidates, online, query results that have become stale in the cache. The basic idea is to evaluate, at query time and against recent changes, whether the results of cache hits have changed. To make invalidation more efficient, the generation time of cached queries and their chronological order with respect to the latest index update are used to prune unaffected queries early. We evaluate the proposed approach using documents that change over time and query logs of the Yahoo! search engine. We show that the proposed approach ensures good query results (50% fewer stale results) and high invalidation accuracy (90% fewer unnecessary invalidations) compared to a baseline approach that makes invalidation decisions off-line. More importantly, the proposed approach induces less processing overhead, ensuring an average throughput 73% higher than that of the baseline approach.
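The abstract gives only the outline of the mechanism, so the following is a minimal Python sketch of the general idea rather than the authors' implementation. It assumes a single in-memory cache, a time-ordered log of document updates, and a simple conjunctive term match to decide whether an update could affect a query. The names `ResultCache`, `CacheEntry`, `Update`, `put`, `record_update`, and `get` are all hypothetical. On a cache hit, an entry generated at or after the latest index update is served immediately (the early pruning step); otherwise only the updates newer than the entry are checked.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    terms: frozenset      # query terms (assumed: whitespace tokenization)
    results: tuple        # cached ranking of document ids
    generated_at: float   # time at which the results were computed


@dataclass(frozen=True)
class Update:
    timestamp: float
    doc_id: int
    terms: frozenset      # terms of the new version of the document


class ResultCache:
    """Hypothetical result cache with query-time invalidation."""

    def __init__(self):
        self.entries = {}      # query string -> CacheEntry
        self.updates = []      # Update objects, in non-decreasing time order
        self._timestamps = []  # parallel list of timestamps for bisecting

    def put(self, query, results, now):
        terms = frozenset(query.lower().split())
        self.entries[query] = CacheEntry(terms, tuple(results), now)

    def record_update(self, doc_id, text, now):
        # Assumption: callers report index updates in chronological order.
        terms = frozenset(text.lower().split())
        self.updates.append(Update(now, doc_id, terms))
        self._timestamps.append(now)

    def get(self, query):
        """Return cached results, or None on a miss or an invalidation."""
        entry = self.entries.get(query)
        if entry is None:
            return None
        # Early pruning: the entry is at least as recent as the latest
        # index update, so no recorded change can have affected it.
        if not self._timestamps or entry.generated_at >= self._timestamps[-1]:
            return entry.results
        # Otherwise inspect only the updates newer than the entry.
        start = bisect_right(self._timestamps, entry.generated_at)
        for upd in self.updates[start:]:
            touches_result = upd.doc_id in entry.results
            matches_query = entry.terms <= upd.terms  # assumed match rule
            if touches_result or matches_query:
                del self.entries[query]  # stale: force recomputation
                return None
        return entry.results
```

In this sketch a `None` return from `get` means the caller should recompute the results at the back-end and store them again with `put`. A real engine would replace the term-subset test with its own matching and scoring logic and would bound the size of the update log.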
