Journal: Information Retrieval

Offline evaluation options for recommender systems

Abstract

We undertake a detailed examination of the steps that make up offline experiments for recommender system evaluation, including the manner in which the available ratings are filtered and split into training and test; the selection of a subset of the available users for the evaluation; the choice of strategy to handle the background effects that arise when the system is unable to provide scores for some items or users; the use of either full or condensed output lists for the purposes of scoring; scoring methods themselves, including alternative top-weighted mechanisms for condensed rankings; and the application of statistical testing on a weighted-by-user or weighted-by-volume basis as a mechanism for providing confidence in measured outcomes. We carry out experiments that illustrate the impact that each of these choice points can have on the usefulness of an end-to-end system evaluation, and provide examples of possible pitfalls. In particular, we show that varying the split between training and test data, or changing the evaluation metric, or how target items are selected, or how empty recommendations are dealt with, can give rise to comparisons that are vulnerable to misinterpretation, and may lead to different or even opposite outcomes, depending on the exact combination of settings used.
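
The abstract enumerates these choice points without pinning them down. The sketch below is a minimal Python illustration, not the authors' experimental code, of three of them: a per-user train/test split, the treatment of users who receive no recommendations, and weighted-by-user versus weighted-by-volume averaging. All names (`split_ratings`, `precision_at_k`, `evaluate`) and the use of precision@k as the metric are assumptions made for the example.

```python
# A minimal sketch of some of the evaluation choice points discussed in
# the abstract. Hypothetical helper names; not the paper's actual code.
import random
from collections import defaultdict

def split_ratings(ratings, test_fraction=0.2, seed=0):
    """Per-user random split of (user, item) pairs into train and test.
    Note the pitfall: a user with very few ratings still contributes at
    least one test item, so low-activity users may lose all training data."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item in ratings:
        by_user[user].append(item)
    train, test = [], []
    for user, items in by_user.items():
        rng.shuffle(items)
        cut = max(1, int(len(items) * test_fraction))
        test += [(user, i) for i in items[:cut]]
        train += [(user, i) for i in items[cut:]]
    return train, test

def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that are held-out relevant items."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def evaluate(recommendations, test_pairs, k=10, skip_empty=True, by_user=True):
    """Mean precision@k, exposing two choice points from the abstract:
    whether users with empty recommendation lists are skipped or scored
    as zero (skip_empty), and whether the mean weights each user equally
    (by_user=True) or by their number of test items (weighted-by-volume)."""
    relevant = defaultdict(set)
    for user, item in test_pairs:
        relevant[user].add(item)
    scores, weights = [], []
    for user, wanted in relevant.items():
        recs = recommendations.get(user, [])
        if not recs and skip_empty:
            continue  # dropping the user quietly inflates the average
        scores.append(precision_at_k(recs, wanted, k))
        weights.append(1.0 if by_user else float(len(wanted)))
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total if total else 0.0
```

Flipping `skip_empty` or `by_user` changes the reported score without touching the recommender at all, which is exactly the kind of settings-dependent reversal the abstract warns against when comparing systems.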