
IRRA at TREC 2009: Index Term Weighting based on Divergence From Independence Model


Abstract

The IRRA (IR-Ra) group participated in the 2009 Web track (both the ad hoc task and the diversity task) and the Million Query track. This year, the major concern is to examine the effectiveness of a novel, nonparametric index term weighting model, divergence from independence (DFI). The notion of independence, which underlies the well-known exploratory statistical data analysis technique called correspondence analysis (Greenacre, 1984; Jambu, 1991), can be adapted to the index term weighting problem. In this setting, independence can be thought of as a qualitative description of how important a term is to the documents in which it appears, where importance means the term's contribution to a document's information content relative to other terms. According to the independence notion, if the ratio of the frequencies of two different terms is the same across documents, the terms are independent of the documents. For example, each Web page contains a pair of 'html' tags and a pair of 'body' tags, so the ratio of the frequencies of these tags is the same across all Web pages, indicating that the 'html' and 'body' tags are independent of Web pages.
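
The independence notion described above can be illustrated with a small sketch. It uses the standard contingency-table definition of independence from correspondence analysis, where the expected count of a term in a document is (term total × document total) / grand total. The function names and the toy counts are assumptions made for illustration; this is not the DFI weighting formula itself, which the abstract does not state.

```python
def expected_counts(counts):
    """Expected term-by-document counts under independence.

    counts[i][j] is the observed frequency of term i in document j.
    Under independence, cell (i, j) is row_total * column_total / grand_total.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def is_independent(counts, tol=1e-9):
    """True when every observed count matches its expected count."""
    expected = expected_counts(counts)
    return all(
        abs(obs - exp) <= tol
        for obs_row, exp_row in zip(counts, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Hypothetical toy data: rows are terms, columns are three Web pages.
# 'html' and 'body' each occur twice (one open/close pair) on every page.
tag_counts = [[2, 2, 2],
              [2, 2, 2]]

# A constant 2:1 ratio across pages is also independent of the pages.
constant_ratio = [[4, 2, 6],
                  [2, 1, 3]]

# The second term is concentrated in the first page, so the ratio varies.
topical = [[2, 2, 2],
           [5, 0, 1]]

print(is_independent(tag_counts))
print(is_independent(constant_ratio))
print(is_independent(topical))
```

In this sketch, a term whose observed counts depart from the expected counts is one whose frequency depends on the document, which is the kind of departure a divergence-from-independence weight is meant to capture.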
