33rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

Retrieval System Evaluation: Automatic Evaluation versus Incomplete Judgments

Abstract

In information retrieval (IR), research aiming to reduce the cost of retrieval system evaluations has been conducted along two lines: (i) the evaluation of IR systems with reduced (i.e. incomplete) amounts of manual relevance assessments, and (ii) the fully automatic evaluation of IR systems, thus forgoing the need for manual assessments altogether. The proposed methods in both areas are commonly evaluated by comparing their performance estimates for a set of systems to a ground truth (provided for instance by evaluating the set of systems according to mean average precision). In contrast, in this poster we compare an automatic system evaluation approach directly to two evaluations based on incomplete manual relevance assessments. For the particular case of TREC's Million Query track, we show that the automatic evaluation leads to results which are highly correlated with those achieved by approaches relying on incomplete manual judgments.
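The comparison the abstract describes reduces to measuring how similarly two evaluation methods rank the same set of systems. As an illustration only, the sketch below computes Kendall's tau, a common rank-correlation choice for such comparisons; the abstract does not name a specific correlation measure, and the per-system scores used here are hypothetical, not data from the paper.

```python
from itertools import combinations

def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score lists for the same systems.

    Counts concordant vs. discordant system pairs; ties are ignored,
    so for real data a ties-aware variant (e.g. scipy.stats.kendalltau)
    would usually be preferred.
    """
    assert len(scores_a) == len(scores_b)
    n = len(scores_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical per-system effectiveness estimates: one list from an
# evaluation with incomplete manual judgments, one from a fully
# automatic evaluation (system order is identical in both lists).
incomplete_judgment_scores = [0.31, 0.27, 0.25, 0.22, 0.18, 0.12]
automatic_scores           = [0.45, 0.41, 0.36, 0.38, 0.29, 0.21]

print(f"Kendall tau = {kendall_tau(incomplete_judgment_scores, automatic_scores):.3f}")
```

A tau close to 1 would indicate that the automatic evaluation orders the systems almost identically to the judgment-based evaluation, which is the sense in which the abstract speaks of highly correlated results.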
