Test Collections and Evaluation Metrics Based on Graded Relevance

Abstract

In modern large information retrieval (IR) environments, the number of documents relevant to a request may easily exceed the number of documents a user is willing to examine. It is therefore desirable to rank highly relevant documents first in search results. Developing retrieval methods for this purpose requires evaluating them accordingly. However, most IR method evaluations are based on rather liberal, binary relevance assessments, so differences between sloppy and excellent IR methods may not be observed in evaluation. An alternative is to employ graded relevance assessments. The present paper discusses graded relevance, test collections providing graded assessments, and evaluation metrics based on graded relevance assessments. We also examine the effects of using graded relevance assessments in retrieval evaluation and report some evaluation results based on graded relevance. We find that graded relevance provides new insight into IR phenomena and affects the relative merits of IR methods.
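
The abstract's central claim, that binary assessments can hide differences which graded assessments reveal, can be illustrated with a minimal sketch. It compares binary precision with normalized discounted cumulative gain (nDCG), a widely used graded-relevance metric. Everything here is an assumption for illustration: the abstract does not say which metrics the paper covers, the 0-3 grade scale and the two rankings are hypothetical, the discount is the common log2(rank + 1) variant, and the ideal ranking is built from the same judged documents only.

```python
import math

def dcg(gains):
    """Discounted cumulative gain using the log2(rank + 1) discount variant."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains):
    """DCG divided by the DCG of the same grades sorted best-first.

    Simplification: the ideal ranking is built only from the documents
    present in `gains`, not from a full set of collection judgments.
    """
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0

def precision_at_k(gains, k):
    """Binary precision: any grade above 0 counts as relevant."""
    return sum(1 for g in gains[:k] if g > 0) / k

# Hypothetical graded judgments per rank (0 = not relevant, 3 = highly relevant).
run_a = [3, 2, 1, 0, 1]  # highly relevant document ranked first
run_b = [1, 1, 2, 0, 3]  # same documents, highly relevant one ranked last

if __name__ == "__main__":
    for name, run in (("A", run_a), ("B", run_b)):
        print(name, precision_at_k(run, 5), round(ndcg(run), 3))
```

Under binary assessment the two rankings are indistinguishable, since both retrieve four relevant documents in the top five. The graded metric scores run A higher because it places the highly relevant document first.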

Bibliographic details

Source: Multilingual information access in South Asian languages, 2010, pp. 280-294 (15 pages)
Conference location: Gandhinagar (IN); Bombay (IN)
Author: Kalervo Jaervelin
Affiliation: School of Information Sciences, University of Tampere, Finland
Original format: PDF
Language: English

Similar documents

1. On the reliability of information retrieval metrics based on graded relevance [J]. Sakai T. Information Processing & Management, 2007, No. 2.
2. Evaluation of an In-line Sampling System for the Collection of Raw Milk Samples for Official Testing under the Grade "A" Milk Program [J]. R. E. RINER, L. SINACK, S. GILLETTE. Food Protection Trends, 2007, No. 4.
3. Using algal metrics and biomass to evaluate multiple ways of defining concentration-based nutrient criteria in streams and their ecological relevance [J]. Nathan J. Smucker, Mary Becker, Naomi E. Detenbeck. Ecological Indicators, 2013, No. sepa.
4. Test Collections and Evaluation Metrics Based on Graded Relevance [C]. Kalervo Järvelin. FIRE 2011, 2013.
5. The Effect of a Tablet-based Electronic Grading Instrument on Data Collection and Inter-rater Agreement in Airline Simulator Evaluations [D]. Elsenrath, Michael C. 2019.
6. An Evaluation of Multiple Query Representations for the Relevance Judgments used to Build a Biomedical Test Collection [O]. Borim Ryu, Jinwook Choi. 2012.
7. Using Graded-Relevance Metrics for Evaluating Community QA Answer Selection [O]. Tetsuya Sakai, Yohei Seki, Daisuke Ishikawa. 2012.
