
Probabilistic models for document retrieval


Abstract

Probabilistic document retrieval systems consistent with the two Poisson independence model outperform those based on the binary independence model if the terms are distributed as described by the model's assumptions. The Two Poisson Effectiveness Hypothesis suggests that retrieval models based upon the two Poisson model will outperform binary independence models when used on a "real-world" database, where the independence and two Poisson term occurrence assumptions fail to hold, because the information gained by incorporating term frequencies will more than compensate for the non-Poisson distributions of terms. Searches of the MED1033 database suggest otherwise: if terms are not independent and frequencies of term occurrence are not distributed in a two Poisson manner, the binary independence sequential retrieval model outperforms the two Poisson independence retrieval model.
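To make the contrast between the two models concrete, the following is a minimal sketch of how a single term might be weighted under each one. It is not taken from the paper: the function names, the parameter names (`p`, `q`, `p_mix`, `q_mix`, `lam_elite`, `lam_other`), and the numeric values are illustrative assumptions, and the two Poisson weight shown is simply a log-likelihood ratio of two mixtures, one common textbook formulation rather than necessarily the one evaluated on MED1033.

```python
import math


def bim_weight(p: float, q: float) -> float:
    """Binary independence weight for a term that is present in a document.

    p: assumed probability the term occurs in a relevant document.
    q: assumed probability the term occurs in a non-relevant document.
    Only presence or absence is used; term frequency is ignored.
    """
    return math.log(p * (1 - q) / (q * (1 - p)))


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of observing k occurrences under a Poisson with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def two_poisson_pmf(k: int, mix: float, lam_elite: float, lam_other: float) -> float:
    """Mixture of two Poissons: weight `mix` on the high-rate component."""
    return mix * poisson_pmf(k, lam_elite) + (1 - mix) * poisson_pmf(k, lam_other)


def two_poisson_weight(tf: int, p_mix: float, q_mix: float,
                       lam_elite: float, lam_other: float) -> float:
    """Log-likelihood ratio of observing frequency tf in relevant versus
    non-relevant documents, assuming both classes share the same two Poisson
    means and differ only in their mixing proportions (hypothetical setup).
    """
    rel = two_poisson_pmf(tf, p_mix, lam_elite, lam_other)
    non = two_poisson_pmf(tf, q_mix, lam_elite, lam_other)
    return math.log(rel / non)


if __name__ == "__main__":
    # Illustrative parameter values only, not estimates from any database.
    print("binary weight:", bim_weight(0.8, 0.2))
    for tf in range(5):
        print(tf, two_poisson_weight(tf, 0.8, 0.2, 3.0, 0.5))
```

Under these assumed parameters the binary weight is a single constant per term, while the two Poisson weight rises with term frequency. That extra sensitivity is the "added information" the hypothesis relies on, and it is also where the model can be misled when real term frequencies do not follow a two Poisson distribution.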

