Conference: Research and advanced technology for digital libraries

A Linguistically Motivated Probabilistic Model of Information Retrieval

Abstract

This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is that documents and queries are defined by an ordered sequence of single terms. This assumption is not made in well-known existing models of information retrieval, but is essential in the field of statistical natural language processing. Advances already made in statistical natural language processing are used in this paper to formulate a probabilistic justification for using tf×idf term weighting. The paper shows that the new probabilistic interpretation of tf×idf term weighting might lead to a better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking. A pilot experiment on the Cranfield test collection indicates that the presented model outperforms the vector space model with classical tf×idf and cosine length normalisation.
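For orientation, the baseline named in the abstract (vector space ranking with classical tf×idf weights and cosine length normalisation) can be sketched roughly as follows. This is not the paper's probabilistic model, which the abstract does not specify in detail. The idf variant `log(N / df)`, the raw term-frequency weighting, and the function names `tfidf_vector`, `cosine`, and `rank` are assumptions made for illustration only.

```python
import math
from collections import Counter


def tfidf_vector(tokens, idf):
    """Map a token list to a sparse tf*idf vector (dict: term -> weight).

    Terms unseen in the collection (absent from `idf`) are ignored.
    """
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank(query, docs):
    """Return (score, doc_index) pairs, best first (ties broken by index).

    `query` is a list of terms; `docs` is a list of term lists.
    Assumed idf variant: log(N / df), with N the number of documents.
    """
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequencies
    idf = {t: math.log(n / df[t]) for t in df}
    q_vec = tfidf_vector(query, idf)
    scores = [(cosine(q_vec, tfidf_vector(d, idf)), i) for i, d in enumerate(docs)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))
```

Calling `rank(["probabilistic", "retrieval"], docs)` on a small list of tokenised documents returns the documents ordered by cosine score. Note that this baseline treats a document as a bag of terms, whereas the model described in the abstract assumes an ordered sequence of terms.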