Journal: Computational Intelligence

AN ASPECT QUERY LANGUAGE MODEL BASED ON QUERY DECOMPOSITION AND HIGH-ORDER CONTEXTUAL TERM ASSOCIATIONS


Abstract

In information retrieval (IR) research, increasing attention has been paid to optimizing a query language model by detecting and estimating the dependencies between the query and the observed terms occurring in the selected relevance feedback documents. In this paper, we propose a novel Aspect Language Modeling framework featuring term association acquisition, document segmentation, query decomposition, and an Aspect Model (AM) for parameter optimization. Through the proposed framework, we advance the theory and practice of applying high-order and context-sensitive term relationships to IR. We first decompose a query into subsets of query terms. Then we segment the relevance feedback documents into chunks using multiple sliding windows. Finally, we discover the higher-order term associations, that is, the terms in these chunks that have a high degree of association with the subsets of the query. In this process, we adopt an approach that combines the AM with Association Rule (AR) mining. In our approach, the AM not only treats the subsets of a query as "hidden" states and estimates their prior distributions, but also evaluates the dependencies between the subsets of a query and the observed terms extracted from the chunks of feedback documents. AR mining provides a reasonable initial estimate of the high-order term associations by discovering association rules from the document chunks. Experimental results on various TREC collections verify the effectiveness of our approach, which significantly outperforms a baseline language model and two state-of-the-art query language models, namely the Relevance Model and the Information Flow model.
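The three steps named in the abstract (query decomposition, sliding-window segmentation, and rule-based association estimates) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the window sizes, the step, and the use of rule confidence as the association score are all assumptions made for illustration, and the Aspect Model stage that would refine these initial estimates is omitted.

```python
from collections import Counter
from itertools import combinations


def query_subsets(query_terms, max_size=2):
    """Enumerate the non-empty subsets of the query terms up to max_size (hypothetical helper)."""
    terms = sorted(set(query_terms))
    return [frozenset(c)
            for k in range(1, max_size + 1)
            for c in combinations(terms, k)]


def sliding_chunks(tokens, window_sizes=(4, 8), step=2):
    """Segment a token list into overlapping chunks, one pass per window size (assumed values)."""
    chunks = []
    for w in window_sizes:
        last_start = max(len(tokens) - w, 0)
        for start in range(0, last_start + 1, step):
            chunks.append(set(tokens[start:start + w]))
    return chunks


def rule_confidences(subsets, chunks):
    """Confidence of the rule subset -> term: the fraction of chunks containing
    the subset that also contain the term (an assumed association measure)."""
    support = Counter()  # number of chunks containing each query subset
    joint = Counter()    # number of chunks containing the subset and the term
    for chunk in chunks:
        for s in subsets:
            if s <= chunk:
                support[s] += 1
                for term in chunk - s:
                    joint[(s, term)] += 1
    return {(s, t): n / support[s] for (s, t), n in joint.items()}


if __name__ == "__main__":
    feedback_doc = "space shuttle launch cost space program shuttle budget".split()
    subsets = query_subsets(["space", "shuttle"])
    chunks = sliding_chunks(feedback_doc, window_sizes=(3, 5), step=1)
    scores = rule_confidences(subsets, chunks)
    # Show the strongest associations first.
    for (subset, term), conf in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
        print(sorted(subset), "->", term, round(conf, 2))
```

In a fuller pipeline these confidences would serve only as starting values; the abstract states that the AM treats the query subsets as hidden states and re-estimates the dependencies, which this sketch does not attempt.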
