A novel term weighting scheme based on discrimination power obtained from past retrieval results

Abstract

Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on the hypothesis that a term's role in accumulated past retrieval sessions affects its general importance. The method uses past retrieval results, which consist of the queries that contain a particular term, the retrieved documents, and their relevance judgments. A term's evidential weight, as proposed in this paper, depends on the degree to which the mean frequency values of the relevant and non-relevant document distributions in the past differ. More precisely, it takes into account the rankings and similarity values of the relevant and non-relevant documents. Our experimental results on standard test collections show that the proposed term weighting scheme improves on conventional TF-IDF and language-model-based schemes. This indicates that evidential term weights bring in a new aspect of term importance and complement the collection statistics used by TF-IDF. We also show how the proposed term weighting scheme, based on the notion of evidential weights, is related to well-known weighting schemes based on language modeling and probabilistic models.
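The central idea of the abstract, that a term earns more weight when its mean frequency in past relevant documents differs from its mean frequency in past non-relevant documents, can be illustrated with a minimal Python sketch. The abstract gives no formula, so everything below is an assumption made for illustration: the function names `discrimination_score` and `combined_weight`, the normalization by the pooled standard deviation, and the mixing parameter `alpha` are hypothetical and are not taken from the paper. The sketch also ignores the rankings and similarity values that the abstract says the actual method takes into account.

```python
from statistics import mean, pstdev


def discrimination_score(rel_tfs, nonrel_tfs, eps=1e-9):
    """Illustrative discrimination power of one term (assumed formula).

    rel_tfs:    term frequencies of the term in past relevant documents
    nonrel_tfs: term frequencies of the term in past non-relevant documents

    Returns the difference of the two means divided by the pooled standard
    deviation. This normalization is an assumption, not the paper's formula.
    """
    if not rel_tfs or not nonrel_tfs:
        # No past evidence on one side: treat the term as uninformative.
        return 0.0
    diff = mean(rel_tfs) - mean(nonrel_tfs)
    spread = pstdev(list(rel_tfs) + list(nonrel_tfs))
    return diff / (spread + eps)


def combined_weight(tf, idf, score, alpha=0.5):
    """Hypothetical way to let the evidence complement TF-IDF.

    Only positive evidence boosts the TF-IDF weight; alpha controls how
    strongly. Both choices are assumptions made for this sketch.
    """
    return tf * idf * (1.0 + alpha * max(score, 0.0))


if __name__ == "__main__":
    # Toy history: the term is frequent in relevant documents and rare in
    # non-relevant ones, so its score is positive.
    score = discrimination_score([3, 4, 5], [0, 1, 0])
    print(score > 0)
    print(combined_weight(tf=2, idf=1.5, score=score) > 2 * 1.5)
```

In this toy setup a term with identical frequency distributions on both sides gets a score of zero and falls back to plain TF-IDF, which matches the abstract's framing of evidential weights as a complement to collection statistics rather than a replacement.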

Bibliographic record

  • Source
    Information Processing & Management | 2012, Issue 5 | pp. 919-930 | 12 pages
  • Author affiliations

    Korea Institute of Science and Technology Information, 245 Daehak-ro, Yuseong-gu, Daejeon 305-806, South Korea, Division of Web Science and Technology, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, South Korea;

    Division of Web Science and Technology, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon 305-701, South Korea;

  • Original format: PDF
  • Language: English
  • Keywords

    term weighting; evidential weight; discrimination power; language model; probabilistic model;

  • Date indexed: 2022-08-17 23:20:15
