Efficient Text Proximity Search

Abstract

In addition to purely occurrence-based relevance models, term proximity has been frequently used to enhance retrieval quality of keyword-oriented retrieval systems. While there have been approaches on effective scoring functions that incorporate proximity, there has not been much work on algorithms or access methods for their efficient evaluation. This paper presents an efficient evaluation framework including a proximity scoring function integrated within a top-k query engine for text retrieval. We propose precomputed and materialized index structures that boost performance. The increased retrieval effectiveness and efficiency of our framework are demonstrated through extensive experiments on a very large text benchmark collection. In combination with static index pruning for the proximity lists, our algorithm achieves an improvement of two orders of magnitude compared to a term-based top-k evaluation, with a significantly improved result quality.
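The abstract does not spell out the scoring function or the index layout, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a simple inverse-squared-distance proximity score for term pairs, precomputes one score-sorted list per term pair, truncates each list to a fixed length as a stand-in for static index pruning, and answers a query by merging the short pair lists exhaustively. The names `pair_proximity`, `build_pair_lists`, `top_k`, and the `keep` parameter are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def pair_proximity(positions_a, positions_b):
    """Assumed score: sum of 1/d^2 over all occurrence pairs of two distinct terms."""
    return sum(1.0 / (pa - pb) ** 2 for pa in positions_a for pb in positions_b)


def build_pair_lists(docs, keep=2):
    """Precompute, per term pair, a list of (score, doc_id) sorted by descending score.

    docs maps doc_id -> list of tokens. Each list is cut to its `keep` best
    entries, an illustrative form of static pruning.
    """
    lists = defaultdict(list)
    for doc_id, tokens in docs.items():
        # Record the token offsets of every term in this document.
        positions = defaultdict(list)
        for offset, token in enumerate(tokens):
            positions[token].append(offset)
        # Pairs are keyed in sorted term order so lookups are unambiguous.
        for t1, t2 in combinations(sorted(positions), 2):
            score = pair_proximity(positions[t1], positions[t2])
            lists[(t1, t2)].append((score, doc_id))
    return {pair: sorted(entries, reverse=True)[:keep]
            for pair, entries in lists.items()}


def top_k(pair_lists, query_terms, k=1):
    """Sum pair scores per document over all query term pairs; return the k best."""
    totals = defaultdict(float)
    for pair in combinations(sorted(set(query_terms)), 2):
        for score, doc_id in pair_lists.get(pair, []):
            totals[doc_id] += score
    # Ties are broken by doc_id to keep the output deterministic.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:k]


if __name__ == "__main__":
    docs = {
        "d1": ["text", "proximity", "search"],
        "d2": ["text", "x", "y", "proximity"],
    }
    pair_lists = build_pair_lists(docs, keep=2)
    print(top_k(pair_lists, ["text", "proximity"], k=1))
```

In this toy collection the document with the two query terms adjacent ranks above the one where they are three positions apart. A real top-k engine would stop reading the sorted lists early once the remaining entries can no longer change the result; this sketch reads the pruned lists in full for brevity.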