Journal: Information Retrieval

Statistical biases in Information Retrieval metrics for recommender systems



Abstract

There is an increasing consensus in the Recommender Systems community that the dominant error-based evaluation metrics are insufficient, and mostly inadequate, to properly assess the practical effectiveness of recommendations. Seeking to evaluate recommendation rankings (which largely determine the effective accuracy in matching user needs) rather than predicted rating values, Information Retrieval metrics have started to be applied for the evaluation of recommender systems. In this paper we analyse the main issues and potential divergences in the application of Information Retrieval methodologies to recommender system evaluation, and provide a systematic characterisation of experimental design alternatives for this adaptation. We lay out an experimental configuration framework upon which we identify and analyse specific statistical biases arising in the adaptation of Information Retrieval metrics to recommendation tasks, namely sparsity and popularity biases. These biases considerably distort the empirical measurements, hindering the interpretation and comparison of results across experiments. We develop a formal characterisation and analysis of the biases upon which we analyse their causes and main factors, as well as their impact on evaluation metrics under different experimental configurations, illustrating the theoretical findings with empirical evidence. We propose two experimental design approaches that effectively neutralise such biases to a large extent. We report experiments validating our proposed experimental variants, and comparing them to alternative approaches and metrics that have been defined in the literature with similar or related purposes.
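The sparsity bias the abstract refers to can be illustrated with a small simulation that is not taken from the paper itself: a minimal sketch, assuming a hypothetical catalogue, user, and uninformative recommender. It measures precision@10 for the same system under two candidate-selection configurations, one where only the user's rated (judged) test items are ranked, and one where the full catalogue is ranked and unjudged items are counted as non-relevant. The measured score changes drastically even though the system is identical, which is the kind of distortion the experimental framework described above is designed to characterise.

```python
import random

rng = random.Random(0)

def precision_at_k(ranking, relevant, k=10):
    """Fraction of the top-k ranked candidates that are known-relevant."""
    return sum(1 for item in ranking[:k] if item in relevant) / k

# Hypothetical setup: a 1000-item catalogue; for one user, 50 items have
# test ratings and 20 of those count as relevant (e.g. rated above a threshold).
items = list(range(1000))
rated = rng.sample(items, 50)
relevant = set(rng.sample(rated, 20))

def random_ranking(candidates):
    """The same (uninformative) recommender, used in both configurations."""
    ranking = list(candidates)
    rng.shuffle(ranking)
    return ranking

trials = 500

# Configuration (a): candidates = rated items only (dense judgements).
p_dense = sum(precision_at_k(random_ranking(rated), relevant)
              for _ in range(trials)) / trials

# Configuration (b): candidates = full catalogue; unrated items are
# unjudged, but the metric must treat them as non-relevant (sparse judgements).
p_sparse = sum(precision_at_k(random_ranking(items), relevant)
               for _ in range(trials)) / trials

print(round(p_dense, 3), round(p_sparse, 3))
```

With the dense configuration the expected precision is roughly the relevance ratio among rated items (20/50 = 0.4), while in the sparse configuration it collapses towards 20/1000, even though nothing about the recommender changed; comparisons are only meaningful between runs that share the same candidate-selection design.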
机译:推荐系统社区中越来越多的共识是,基于错误的主要评估指标不足以且不足以正确评估建议的实际有效性。为了评估推荐等级(这在很大程度上决定了满足用户需求的有效准确性),而不是预测的等级值,信息检索指标已开始应用于推荐系统的评估。在本文中,我们分析了信息检索方法论在推荐系统评估中的主要问题和潜在的分歧,并为适应性实验提供了实验设计替代方案的系统表征。我们提出了一个实验性配置框架,在此框架上,我们可以识别和分析在将信息检索指标适应推荐任务时出现的特定统计偏差,即稀疏性和受欢迎度偏差。这些偏差极大地扭曲了经验测量结果,从而阻碍了整个实验结果的解释和比较。我们对偏差进行了正式的表征和分析,以此来分析其成因和主要因素,以及它们在不同实验配置下对评估指标的影响,以经验证据说明理论发现。我们提出了两种实验设计方法,可以在很大程度上有效抵消这种偏差。我们报告了一些实验,这些实验验证了我们提出的实验变体,并将它们与文献中出于相似或相关目的定义的替代方法和指标进行了比较。


