What Does Affect the Correlation Among Evaluation Measures?

NICOLA FERRO

Abstract

Information Retrieval (IR) is well known for its large number of evaluation measures, with new ones appearing more and more frequently. In this context, correlation analysis is the tool used to study evaluation measures: it lets us understand whether two measures rank systems similarly, whether they capture different aspects of system performance or reflect different user models, and whether a new measure is well motivated. To this end, the two most commonly used correlation coefficients are Kendall's τ correlation and the AP correlation τ_ap.

The goal of the article is to investigate the properties of the tool we use to study evaluation measures, that is, correlation analysis itself. In particular, we investigate three research questions about these two correlation coefficients: (ⅰ) what is the effect of the number of systems and topics? (ⅱ) what is the effect of removing low-performing systems? (ⅲ) what is the effect of the experimental collections?

To answer these research questions, we propose a methodology based on the General Linear Mixed Model (GLMM) and ANalysis Of VAriance (ANOVA) to isolate the effects of the number of topics, the number of systems, and the experimental collections, and to observe expected correlation values, net of these effects, which are stable and reliable.

We learned that the effect of the number of topics is more prominent than the effect of the number of systems. Although it produces different absolute values, removing low-performing systems does not seem to provide information substantially different from not removing them, especially when comparing a whole set of evaluation measures. Finally, we found that both document corpora and topic sets affect the correlation among evaluation measures, the effect of the latter being more prominent. Moreover, there is a substantial interaction between evaluation measures, corpora, and topic sets, meaning that the correlation between different evaluation measures can be substantially increased or decreased depending on the corpora and topics at hand.
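To make the two coefficients named in the abstract concrete, the following is a minimal Python sketch, not the article's own code. It assumes the standard definitions: Kendall's τ as (concordant pairs minus discordant pairs) divided by the total number of pairs, and the AP correlation τ_ap as commonly defined in the IR literature, where the ranking under evaluation is scanned top-down against a reference ranking. The function names (`rank_systems`, `kendall_tau`, `ap_correlation`) and the toy scores are illustrative assumptions, and ties between systems are assumed not to occur.

```python
from itertools import combinations


def rank_systems(scores):
    """Return system ids ordered from best to worst score (assumes no ties)."""
    return sorted(scores, key=scores.get, reverse=True)


def kendall_tau(scores_x, scores_y):
    """Kendall's tau between the system rankings induced by two measures."""
    systems = list(scores_x)
    concordant = discordant = 0
    for a, b in combinations(systems, 2):
        sign = (scores_x[a] - scores_x[b]) * (scores_y[a] - scores_y[b])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(systems)
    return (concordant - discordant) / (n * (n - 1) / 2)


def ap_correlation(scores_eval, scores_ref):
    """AP correlation of the ranking from scores_eval against scores_ref.

    For each system below the top of the evaluated ranking, count the share
    of systems ranked above it that the reference also places above it,
    average these shares, and rescale the average to [-1, 1].
    """
    ranking = rank_systems(scores_eval)
    n = len(ranking)
    total = 0.0
    for i in range(1, n):  # 0-based position i has exactly i systems above it
        current = ranking[i]
        correct = sum(
            1 for above in ranking[:i] if scores_ref[above] > scores_ref[current]
        )
        total += correct / i
    return 2.0 * total / (n - 1) - 1.0


if __name__ == "__main__":
    # Toy scores for four hypothetical systems under a reference measure.
    reference = {"A": 0.4, "B": 0.3, "C": 0.2, "D": 0.1}
    # Same scores with the top two systems swapped, or the bottom two swapped.
    swap_top = {"A": 0.3, "B": 0.4, "C": 0.2, "D": 0.1}
    swap_bottom = {"A": 0.4, "B": 0.3, "C": 0.1, "D": 0.2}
    for name, scores in [("swap_top", swap_top), ("swap_bottom", swap_bottom)]:
        print(name, kendall_tau(scores, reference), ap_correlation(scores, reference))
```

In this toy example, Kendall's τ is the same whether the swap happens at the top or at the bottom of the ranking, whereas τ_ap is lower for the swap at the top. This top-heaviness is the usual motivation for reporting both coefficients.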

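Bibliographic details

Source
ACM Transactions on Information Systems | 2018, Issue 2 | 19.1-19.40 | 40 pages
Author
NICOLA FERRO;
Affiliation

University of Padua;

Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
Original format: PDF
Language of text: English
Keywords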
Correlation analysis; Kendall's tau correlation; AP correlation; evaluation measures; general linear mixed models (GLMM); analysis of variance (ANOVA); grid of points (GoP);
