Journal of the American Society for Information Science and Technology

A Simple Kernel Co-Occurrence-Based Enhancement for Pseudo-Relevance Feedback


Abstract

This article proposes a kernel co-occurrence-based framework that integrates term co-occurrence information into the classic Rocchio model and a relevance model (RM3) to enhance retrieval performance. When selecting and weighting candidate terms from feedback documents, we use a linear combination of model components to balance the influence of the classic models against that of models that capture term co-occurrence information. This yields two kernel co-occurrence-based methods, KRoc and KRM3. In particular, to make better use of term co-occurrence information, we incorporate it into the whole term weight formula by simultaneously refining both the term discriminating power factor and the within-document term weight factor in feedback documents. The experimental results show that the proposed KRoc and KRM3 methods are effective and outperform the corresponding strong baseline methods in terms of MAP and P@10 on most of the test collections. Our proposed methods are also at least comparable to the state-of-the-art TF-PRF, PRoc, IF&FB, and MRF models. Additionally, we analyze the influence of σ on the proposed KRoc and KRM3 methods and suggest an empirical rule for setting this parameter to achieve good performance.
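The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: a classic feedback term weight linearly combined with a kernel-based co-occurrence score. Everything specific here is an assumption made for illustration and is not taken from the paper: the Gaussian kernel over token positions (with `sigma` standing in for σ), the normalized term frequency used as the "classic" component, the max-normalization, the interpolation parameter `lam`, and all function names. It should not be read as the KRoc or KRM3 weighting itself.

```python
import math
from collections import Counter, defaultdict


def gaussian_kernel(distance, sigma):
    """Assumed kernel: weight for two occurrences `distance` positions apart."""
    return math.exp(-(distance ** 2) / (2.0 * sigma ** 2))


def cooccurrence_scores(doc_tokens, query_terms, sigma):
    """Sum kernel weights between each candidate term and every query-term occurrence."""
    query_positions = [i for i, tok in enumerate(doc_tokens) if tok in query_terms]
    scores = defaultdict(float)
    for i, tok in enumerate(doc_tokens):
        if tok in query_terms:
            continue  # only expansion candidates are scored
        for j in query_positions:
            scores[tok] += gaussian_kernel(i - j, sigma)
    return scores


def _max_normalize(weights):
    """Scale weights into [0, 1] so the two components are comparable."""
    top = max(weights.values(), default=0.0)
    return {t: (w / top if top > 0 else 0.0) for t, w in weights.items()}


def expansion_weights(feedback_docs, query_terms, sigma=2.0, lam=0.5):
    """Hypothetical combination: (1 - lam) * classic + lam * kernel co-occurrence."""
    query_terms = set(query_terms)
    classic = defaultdict(float)
    kernel = defaultdict(float)
    for doc in feedback_docs:
        counts = Counter(tok for tok in doc if tok not in query_terms)
        for tok, count in counts.items():
            classic[tok] += count / len(doc)  # length-normalized term frequency
        for tok, score in cooccurrence_scores(doc, query_terms, sigma).items():
            kernel[tok] += score
    classic = _max_normalize(classic)
    kernel = _max_normalize(kernel)
    return {t: (1.0 - lam) * classic[t] + lam * kernel.get(t, 0.0) for t in classic}


if __name__ == "__main__":
    docs = [["query", "near", "x", "x", "x", "x", "x", "x", "far"]]
    weights = expansion_weights(docs, ["query"], sigma=2.0, lam=0.5)
    for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
        print(f"{term}\t{weight:.3f}")
```

In this toy document, `near` and `far` occur equally often, so the frequency-only component cannot separate them; under the assumed kernel, `near` receives the larger combined weight because it sits next to the query term. A smaller `sigma` makes the kernel decay faster with distance, which is one plausible reading of why the setting of σ matters.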

Bibliographic record

  • Source
  • Author affiliations

    Information Retrieval and Knowledge Management Research Lab, National Engineering Research Center for E-Learning, Central China Normal University, Wuhan, China, and School of Computer and Information Engineering, Hubei Normal University, Huangshi, China, and Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, ON, Canada;

    Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, ON, Canada;

    Information Retrieval and Knowledge Management Research Lab, School of Computer, Central China Normal University, Wuhan, China;

    School of Computer and Information Engineering, Hubei Normal University, Huangshi, China, and Information Retrieval and Knowledge Management Research Lab, School of Computer, Central China Normal University, Wuhan, China;

    Information Retrieval and Knowledge Management Research Lab, School of Information Technology, York University, Toronto, ON, Canada, and Information Retrieval and Knowledge Management Research Lab, School of Information Management, Central China Normal University, Wuhan, China;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

  • Date added: 2022-08-18 05:24:19
