ACM Transactions on Information Systems

Learning to Adaptively Rank Document Retrieval System Configurations

Abstract

Modern Information Retrieval (IR) systems have become increasingly complex and involve a large number of parameters. For example, a system may choose from a set of possible retrieval models (BM25, language model, etc.) or from various query expansion parameters, whose values greatly influence overall retrieval effectiveness. Traditionally, these parameters are set at the system level based on training queries, and the same values are then used for all queries. We observe that it may not be easy to set these parameters separately, since they can be interdependent. In addition, a single global setting may not suit individual queries with different characteristics; the parameters should instead be set according to those characteristics. In this article, we propose a novel approach that tackles this problem by dealing with entire system configurations (i.e., sets of parameters that define an IR system's behaviour) instead of selecting a single parameter at a time. Selecting the best configuration is cast as the problem of ranking the possible configurations for a given query, and we apply learning-to-rank approaches to this task. We exploit both query features and system configuration features in the learning-to-rank method so that the selection of a configuration is query dependent. Experiments on four TREC ad hoc collections show that this approach can significantly outperform the traditional method of tuning the system configuration globally (i.e., grid search) and leads to higher effectiveness than the top-performing systems of the TREC tracks. We also perform an ablation analysis of the impact of different features on the model's learning capability and show that query expansion features are among the most important for adaptive systems.
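
The idea of ranking whole configurations per query can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the learning-to-rank model, so a simple pointwise least-squares scorer is swapped in here. The configuration space, the single query-length feature, and the `toy_ap` labels are all hypothetical and synthetic, chosen only to show how joint query and configuration features make the selection query dependent, with a globally tuned configuration as the baseline.

```python
from itertools import product

# Hypothetical configuration space: retrieval model x number of expansion terms.
CONFIGS = list(product(["BM25", "LM"], [0, 10]))

def features(query_len, config):
    """Joint feature vector for one (query, configuration) pair."""
    model, qe_terms = config
    q = (query_len - 5) / 5.0                 # centred query-length feature
    c = [1.0 if model == "LM" else -1.0,      # retrieval model
         1.0 if qe_terms > 0 else -1.0]       # query expansion on/off
    # Cross terms let a linear scorer rank configurations differently per query.
    return [1.0, q] + c + [q * x for x in c]

def toy_ap(query_len, config):
    """Synthetic effectiveness label (NOT real TREC data): expansion helps
    short queries and hurts long ones; LM is slightly better throughout."""
    _, _, m, e, _, qe = features(query_len, config)
    return 0.30 + 0.02 * m + 0.01 * e - 0.20 * qe

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def fit_pointwise(rows, targets, lr=0.5, epochs=500):
    """Least-squares regression by batch gradient descent (pointwise ranking)."""
    w = [0.0] * len(rows[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, targets):
            err = dot(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def rank_configs(w, query_len):
    """Rank all configurations for one query, best predicted score first."""
    return sorted(CONFIGS, key=lambda c: dot(w, features(query_len, c)), reverse=True)

train_queries = [2, 3, 5, 7, 8]               # hypothetical query lengths
pairs = [(q, c) for q in train_queries for c in CONFIGS]
w = fit_pointwise([features(q, c) for q, c in pairs],
                  [toy_ap(q, c) for q, c in pairs])

# Baseline: one global configuration picked by mean training effectiveness,
# i.e. a grid search over the same configuration space.
global_best = max(CONFIGS, key=lambda c: sum(toy_ap(q, c) for q in train_queries))

short_best = rank_configs(w, 2)[0]            # top configuration for a short query
long_best = rank_configs(w, 8)[0]             # top configuration for a long query
```

On this toy data the global baseline keeps query expansion switched on for every query, while the per-query ranking switches it off for the long query. Without the cross terms in `features`, a linear scorer would order the configurations identically for every query, so the interaction between query and configuration features is what makes the choice adaptive.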
