Learning to Adaptively Rank Document Retrieval System Configurations

Deveaud Romain; Mothe Josiane; Ullah Md Zia; Nie Jian-Yun

Abstract

Modern Information Retrieval (IR) systems have become more and more complex, involving a large number of parameters. For example, a system may choose from a set of possible retrieval models (BM25, language model, etc.), or various query expansion parameters, whose values greatly influence the overall retrieval effectiveness. Traditionally, these parameters are set at a system level based on training queries, and the same parameters are then used for different queries. We observe that it may not be easy to set all these parameters separately, since they can be dependent. In addition, a global setting for all queries may not best fit all individual queries with different characteristics. The parameters should be set according to these characteristics. In this article, we propose a novel approach to tackle this problem by dealing with the entire system configurations (i.e., a set of parameters representing an IR system behaviour) instead of selecting a single parameter at a time. The selection of the best configuration is cast as a problem of ranking different possible configurations given a query. We apply learning-to-rank approaches for this task. We exploit both the query features and the system configuration features in the learning-to-rank method so that the selection of configuration is query dependent. The experiments we conducted on four TREC ad hoc collections show that this approach can significantly outperform the traditional method to tune system configuration globally (i.e., grid search) and leads to higher effectiveness than the top performing systems of the TREC tracks. We also perform an ablation analysis on the impact of different features on the model learning capability and show that query expansion features are among the most important for adaptive systems.
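
The idea of ranking whole configurations per query can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the learning-to-rank algorithms or the features used, so the example below assumes a pointwise linear scorer, a single made-up query feature q in [0, 1], a hypothetical four-entry configuration space, and a synthetic effectiveness function standing in for a per-query metric such as average precision. The cross terms between query and configuration features are also an assumption of this sketch; without some interaction of this kind, a linear scorer would rank configurations identically for every query.

```python
from itertools import product

# Hypothetical configuration space: retrieval model x number of expansion terms.
CONFIGS = [{"model": m, "exp_terms": k}
           for m, k in product(["BM25", "LM"], [0, 10])]


def pair_features(q, cfg):
    """Features of one (query, configuration) pair.

    q is a single made-up query feature in [0, 1]. The cross terms let a
    linear scorer prefer different configurations for different queries.
    """
    is_bm25 = 1.0 if cfg["model"] == "BM25" else 0.0
    expand = cfg["exp_terms"] / 10.0
    return [1.0, q, is_bm25, expand, q * is_bm25, q * expand]


def toy_effectiveness(q, cfg):
    """Synthetic stand-in for a per-query metric such as average precision."""
    is_bm25 = 1.0 if cfg["model"] == "BM25" else 0.0
    expand = cfg["exp_terms"] / 10.0
    return 0.3 + 0.05 * is_bm25 - 0.1 * expand + 0.3 * q * expand


def train_pointwise(examples, epochs=3000, lr=0.05):
    """Fit a linear scorer by stochastic gradient descent on squared error."""
    w = [0.0] * len(examples[0][0])
    for _ in range(epochs):
        for x, y in examples:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def rank_configs(w, q):
    """Rank all configurations for one query, best predicted first."""
    def score(cfg):
        return sum(wi * xi for wi, xi in zip(w, pair_features(q, cfg)))
    return sorted(CONFIGS, key=score, reverse=True)


def global_best(train_queries):
    """Grid-search baseline: the one configuration with the best mean score."""
    def mean_eff(cfg):
        total = sum(toy_effectiveness(q, cfg) for q in train_queries)
        return total / len(train_queries)
    return max(CONFIGS, key=mean_eff)


train_queries = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
examples = [(pair_features(q, c), toy_effectiveness(q, c))
            for q in train_queries for c in CONFIGS]
weights = train_pointwise(examples)

print("global:", global_best(train_queries))
for q in (0.1, 0.9):
    print("query feature", q, "->", rank_configs(weights, q)[0])
```

On this toy data the grid-search baseline settles on one configuration for all queries, while the learned scorer is free to put a different configuration first depending on the query feature. A real system would replace toy_effectiveness with measured effectiveness on training queries and would likely use a dedicated learning-to-rank library rather than this hand-rolled regressor.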

Bibliographic record

Source: ACM Transactions on Information Systems, 2019, Issue 1, pp. 3.1-3.41 (41 pages)
Authors: Deveaud Romain; Mothe Josiane; Ullah Md Zia; Nie Jian-Yun
Author affiliations:
Univ Toulouse, UMR5505 CNRS, IRIT, 118 Route Narbonne, F-31062 Toulouse, France
Univ Montreal, Dept Informat & Rech Operat, CP 6128, Succ Ctr Ville, Montreal, PQ, Canada
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Information systems; information retrieval; learning to rank; retrieval system parameters; adaptive information retrieval; query features; data analytics
