33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Retrieval System Evaluation: Automatic Evaluation versus Incomplete Judgments

Abstract

In information retrieval (IR), research aiming to reduce the cost of retrieval system evaluations has been conducted along two lines: (i) the evaluation of IR systems with reduced (i.e. incomplete) amounts of manual relevance assessments, and (ii) the fully automatic evaluation of IR systems, thus forgoing the need for manual assessments altogether. The proposed methods in both areas are commonly evaluated by comparing their performance estimates for a set of systems to a ground truth (provided for instance by evaluating the set of systems according to mean average precision). In contrast, in this poster we compare an automatic system evaluation approach directly to two evaluations based on incomplete manual relevance assessments. For the particular case of TREC's Million Query track, we show that the automatic evaluation leads to results which are highly correlated with those achieved by approaches relying on incomplete manual judgments.
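The comparison the abstract describes reduces to measuring how similarly two evaluation methods rank the same set of systems. As an illustration only, the sketch below computes Kendall's tau, a common rank-correlation choice for such comparisons; the abstract does not name a specific correlation measure, and the per-system scores used here are hypothetical, not data from the paper.

```python
from itertools import combinations

def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score lists for the same systems.

    Counts concordant vs. discordant system pairs; ties are ignored,
    so for real data a ties-aware variant (e.g. scipy.stats.kendalltau)
    would usually be preferred.
    """
    assert len(scores_a) == len(scores_b)
    n = len(scores_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical per-system effectiveness estimates: one list from an
# evaluation with incomplete manual judgments, one from a fully
# automatic evaluation (system order is identical in both lists).
incomplete_judgment_scores = [0.31, 0.27, 0.25, 0.22, 0.18, 0.12]
automatic_scores           = [0.45, 0.41, 0.36, 0.38, 0.29, 0.21]

print(f"Kendall tau = {kendall_tau(incomplete_judgment_scores, automatic_scores):.3f}")
```

A tau close to 1 would indicate that the automatic evaluation orders the systems almost identically to the judgment-based evaluation, which is the sense in which the abstract speaks of highly correlated results.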
