Journal: Information Processing & Management

Relevance-based entity selection for ad hoc retrieval

Abstract

Recent developments have shown that entity-based models that rely on information from the knowledge graph can improve document retrieval performance. However, given the non-transitive nature of relatedness between entities on the knowledge graph, the use of semantic relatedness measures can lead to topic drift. To address this issue, we propose a relevance-based model for entity selection based on pseudo-relevance feedback, which is then used to systematically expand the input query, leading to improved retrieval performance. We perform our experiments on the widely used TREC Web corpora and empirically show that our proposed approach to entity selection significantly improves ad hoc document retrieval compared to strong baselines. More concretely, the contributions of this work are as follows: (1) We introduce a graphical probability model that captures dependencies between entities within the query and documents. (2) We propose an unsupervised entity selection method based on the graphical model for query entity expansion and then for ad hoc retrieval. (3) We thoroughly evaluate our method and compare it with state-of-the-art keyword-based and entity-based retrieval methods. We demonstrate that the proposed retrieval model outperforms all the other baselines on ClueWeb09B and ClueWeb12B, two widely used Web corpora, on the NDCG@20 and ERR@20 metrics. We also show that the proposed method is most effective on difficult queries. In addition, we compare our proposed entity selection with a state-of-the-art entity selection technique within the context of ad hoc retrieval using a basic query expansion method, and illustrate that it provides more effective retrieval for all expansion weights and for different numbers of expansion entities.
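The abstract does not spell out the graphical model, so the following is only a minimal sketch of the general idea of selecting expansion entities from pseudo-relevance feedback. It swaps in a simple relevance-model-style estimate (document weight times in-document entity frequency) rather than the authors' method. The function names `select_expansion_entities` and `expand_query`, the input layout, and the default parameter values are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def select_expansion_entities(feedback_docs, k=5):
    """Rank entities linked in the top-ranked (pseudo-relevant) documents.

    feedback_docs: list of (retrieval_score, [entity ids linked in the doc]).
    Assumption: retrieval scores behave like log-likelihoods of the query.
    """
    if not feedback_docs:
        return []
    # Softmax over retrieval scores gives a document weight P(d | q).
    max_score = max(score for score, _ in feedback_docs)
    weights = [math.exp(score - max_score) for score, _ in feedback_docs]
    total = sum(weights)

    entity_scores = defaultdict(float)
    for weight, (_, entities) in zip(weights, feedback_docs):
        if not entities:
            continue
        for entity, count in Counter(entities).items():
            # P(e | d) as a maximum-likelihood estimate within the document.
            entity_scores[entity] += (weight / total) * (count / len(entities))

    # Highest score first; ties broken by entity id for determinism.
    ranked = sorted(entity_scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


def expand_query(query_terms, ranked_entities, expansion_weight=0.3):
    """Interpolate the original terms with the selected entities."""
    weighted = {
        term: (1.0 - expansion_weight) / len(query_terms) for term in query_terms
    }
    norm = sum(score for _, score in ranked_entities)
    for entity, score in ranked_entities:
        weighted[entity] = weighted.get(entity, 0.0) + expansion_weight * score / norm
    return weighted


if __name__ == "__main__":
    # Hypothetical feedback set: two documents with made-up entity links.
    docs = [(-1.0, ["Barack_Obama", "White_House"]), (-1.0, ["Barack_Obama"])]
    entities = select_expansion_entities(docs, k=2)
    print(expand_query(["obama", "family"], entities, expansion_weight=0.4))
```

In this sketch, `expansion_weight` and `k` correspond to the two settings the abstract says were varied: the expansion weight and the number of expansion entities.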