Published in: ACM SIGIR Forum

Statistical Significance Testing in Information Retrieval: Theory and Practice

Abstract

The past 20 years have seen a great improvement in the rigor of information retrieval experimentation, due primarily to two factors: high-quality, public, portable test collections such as those produced by TREC (the Text REtrieval Conference [38]), and the increased practice of statistical hypothesis testing to determine whether measured improvements can be ascribed to something other than random chance. Together these create a very useful standard for reviewers, program committees, and journal editors; work in information retrieval (IR) increasingly cannot be published unless it has been evaluated using a well-constructed test collection and shown to produce a statistically significant improvement over a good baseline.

But, as the saying goes, any tool sharp enough to be useful is also sharp enough to be dangerous. Statistical tests of significance are widely misunderstood. Most researchers and developers treat them as a “black box”: evaluation results go in and a p-value comes out. But because significance is such an important factor in determining what research directions to explore and what is published, using p-values obtained without thought can have consequences for everyone doing research in IR. Ioannidis has argued that the main consequence in the biomedical sciences is that most published research findings are false [20]; could that be the case in IR as well?