Journal: Information Processing & Management

Supervised approaches for explicit search result diversification

Abstract

Diversification of web search results aims to promote documents with diverse content (i.e., covering different aspects of a query) to the top-ranked positions, in order to satisfy more users, enhance fairness, and reduce bias. In this work, we focus on explicit diversification methods, which assume that the query aspects are known at diversification time, and we leverage supervised learning to improve their performance in three frameworks with different features and goals. First, in the LTRDiv framework, we apply typical learning-to-rank (LTR) algorithms to obtain a ranking in which each top-ranked document covers as many aspects as possible. We argue that such rankings optimize various diversification metrics (under certain assumptions) and hence are likely to achieve diversity in practice. Second, in the AspectRanker framework, we apply LTR to rank the aspects of a query, with the goal of setting the aspect importance values used for diversification more accurately. As features, we exploit several pre- and post-retrieval query performance predictors (QPPs) to estimate how well a given aspect is covered among the candidate documents. Finally, in the LmDiv framework, we cast the diversification problem as an alternative fusion task, namely, the supervised merging of the rankings obtained per query aspect. We again use QPPs computed over the candidate set for each aspect, and optimize an objective function tailored to the diversification goal. We conduct thorough comparative experiments using both basic systems (based on the well-known BM25 matching function) and the best-performing systems (with more sophisticated retrieval methods) from previous TREC campaigns. Our findings reveal that the proposed frameworks, especially AspectRanker and LmDiv, outperform both non-diversified rankings and two strong diversification baselines (i.e., xQuAD and its variant) in terms of various effectiveness metrics.
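To make the notion of explicit diversification concrete, the following is a minimal sketch of greedy re-ranking in the style of xQuAD, which the abstract names as a baseline. It is not the authors' code. It assumes the commonly cited xQuAD scoring form, (1 - λ)·P(d|q) + λ·Σ_a P(a|q)·P(d|a)·Π_{d' ∈ S}(1 - P(d'|a)), and every name and number in it (`xquad_rerank`, `relevance`, `aspect_weight`, `coverage`, the toy scores) is hypothetical. The `aspect_weight` input plays the role of the aspect importance values that, according to the abstract, AspectRanker tries to set more accurately.

```python
from typing import Dict, List


def xquad_rerank(
    relevance: Dict[str, float],            # assumed P(d|q) per document
    aspect_weight: Dict[str, float],        # assumed P(a|q) per known query aspect
    coverage: Dict[str, Dict[str, float]],  # assumed P(d|a): coverage[doc][aspect]
    lam: float = 0.5,                       # trade-off between relevance and diversity
    k: int = 10,
) -> List[str]:
    """Greedy xQuAD-style selection (illustrative sketch, not the paper's code)."""
    selected: List[str] = []
    remaining = set(relevance)
    # novelty[a] is the product over already selected docs of (1 - P(d'|a)),
    # i.e. how much of aspect a is still uncovered.
    novelty = {a: 1.0 for a in aspect_weight}

    def score(doc: str) -> float:
        diversity = sum(
            aspect_weight[a] * coverage.get(doc, {}).get(a, 0.0) * novelty[a]
            for a in aspect_weight
        )
        return (1.0 - lam) * relevance[doc] + lam * diversity

    while remaining and len(selected) < k:
        # sorted() makes tie-breaking deterministic (by document id).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
        for a in aspect_weight:
            novelty[a] *= 1.0 - coverage.get(best, {}).get(a, 0.0)
    return selected


# Toy data (hypothetical): d1 and d2 cover aspect a1, d3 covers aspect a2.
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.5}
aspect_weight = {"a1": 0.5, "a2": 0.5}
coverage = {"d1": {"a1": 1.0}, "d2": {"a1": 1.0}, "d3": {"a2": 1.0}}

diversified = xquad_rerank(relevance, aspect_weight, coverage, lam=0.5, k=3)
relevance_only = xquad_rerank(relevance, aspect_weight, coverage, lam=0.0, k=3)
print(diversified)     # ['d1', 'd3', 'd2']
print(relevance_only)  # ['d1', 'd2', 'd3']
```

In this toy run, once d1 has covered aspect a1, the less relevant d3 is promoted above d2 because it is the only document covering a2. With `lam=0.0` the diversity term vanishes and the ranking falls back to pure relevance order.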
