Source: ACM SIGIR Forum

Improving Retrieval Performance for Verbose Queries via Axiomatic Analysis of Term Discrimination Heuristic



Abstract

The number of terms in a query is a query-specific constant that is typically ignored in retrieval functions. However, previous studies have shown that the performance of retrieval models varies for different query lengths, and it usually degrades when query length increases. A possible reason for this issue is the extraneous terms in longer queries, which make it a challenge for retrieval models to distinguish between the key and complementary concepts of the query. As a signal of the importance of a term, inverse document frequency (IDF) can be used to discriminate query terms. In this paper, we propose a constraint to model the interaction between query length and IDF. Our theoretical analysis shows that current state-of-the-art retrieval models, such as BM25, do not satisfy the proposed constraint. We further analyze the BM25 model and suggest a modification to adapt BM25 so that it adheres to the new constraint. Our experiments on three TREC collections demonstrate that the proposed modification outperforms the baselines, especially for verbose queries.
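
As an illustration of the observation that query length is typically ignored, the following is a minimal sketch of a common textbook form of BM25. It is not the modified variant proposed in the paper, which the abstract does not spell out. The function names, the toy corpus, the smoothed IDF form, and the defaults k1 = 1.2 and b = 0.75 are assumptions made here for illustration, and the query-term-frequency component of the full Okapi formula is omitted.

```python
import math
from collections import Counter


def idf(term, docs):
    """Smoothed IDF, log(1 + (N - df + 0.5) / (df + 0.5)); one common variant."""
    n = len(docs)
    df = sum(1 for d in docs if term in d)  # number of documents containing the term
    return math.log(1.0 + (n - df + 0.5) / (df + 0.5))


def bm25_score(query, doc, docs, k1=1.2, b=0.75):
    """Score one tokenized document against a tokenized query.

    The score is a plain sum over query terms; the number of query
    terms never enters the per-term weight.
    """
    avgdl = sum(len(d) for d in docs) / len(docs)  # average document length
    tf = Counter(doc)
    score = 0.0
    for term in query:
        f = tf[term]
        if f == 0:
            continue  # terms absent from the document contribute nothing
        norm = f + k1 * (1.0 - b + b * len(doc) / avgdl)
        score += idf(term, docs) * f * (k1 + 1.0) / norm
    return score


# Hypothetical toy corpus of pre-tokenized documents.
docs = [
    ["verbose", "query", "retrieval"],
    ["retrieval", "model"],
    ["term", "weight", "idf", "retrieval"],
]

short_query = ["verbose"]
long_query = ["verbose", "model"]  # extra term that does not occur in docs[0]
print(bm25_score(short_query, docs[0], docs))
print(bm25_score(long_query, docs[0], docs))
```

In this sketch the two printed scores are identical: lengthening the query changes neither the IDF nor the weight assigned to the matching term, so the function has no way to let query length interact with term discrimination.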
