ACM International Conference on Information and Knowledge Management

An Analysis of Systematic Judging Errors in Information Retrieval



Abstract

Test collections are powerful mechanisms for evaluation and optimization of information retrieval systems. There is reported evidence that experiment outcomes can be affected by changes in the judge population or in judging guidelines. We examine such effects in a web search setting, comparing the judgments of four groups of judges: NIST Web Track judges, untrained crowd workers, and two groups of trained judges of a commercial search engine. Our goal is to identify systematic judging errors by comparing the labels contributed by the different groups. In particular, we focus on detecting systematic differences in judging depending on specific characteristics of the queries and URLs. For example, we ask whether a given population of judges, working under a given set of judging guidelines, is more likely to overrate Wikipedia pages than another group judging under the same instructions. Our approach is to identify judging errors with respect to a consensus set, a judged gold set, and a set of user clicks. We further demonstrate how such biases can affect the training of retrieval systems.
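
To make the comparison concrete, here is a minimal illustrative sketch in Python (not the paper's code): it measures each judge group's mean signed deviation from a majority-vote consensus, restricted to URLs with a given property such as being a Wikipedia page, so a positive score suggests the group systematically overrates such pages. All data, group names, and the tie-breaking rule are hypothetical; the paper also uses a judged gold set and user clicks as alternative reference points.

```python
# Illustrative sketch: per-group "overrating" of a URL property
# relative to a majority-vote consensus across all groups.
# All labels and identifiers below are hypothetical.

from collections import Counter
from statistics import mean

# judgments[(query, url)][group] -> relevance label on an ordinal 0-4 scale
judgments = {
    ("jaguar speed", "en.wikipedia.org/wiki/Jaguar"): {
        "nist": 2, "crowd": 4, "trained_a": 3, "trained_b": 3},
    ("jaguar speed", "example.com/jaguar-facts"): {
        "nist": 3, "crowd": 2, "trained_a": 3, "trained_b": 2},
}

def consensus(labels):
    """Majority label across groups; ties broken toward the lower label."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, c in counts.items() if c == top)

def mean_signed_error(group, predicate):
    """Mean (group label - consensus label) over items matching predicate.
    Positive values indicate the group tends to overrate those items."""
    errors = []
    for (query, url), by_group in judgments.items():
        if not predicate(url):
            continue
        errors.append(by_group[group] - consensus(list(by_group.values())))
    return mean(errors) if errors else 0.0

is_wikipedia = lambda url: "wikipedia.org" in url
for group in ("nist", "crowd", "trained_a", "trained_b"):
    print(group, mean_signed_error(group, is_wikipedia))
```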
