Journal: Information Retrieval

Offline evaluation options for recommender systems

Abstract

We undertake a detailed examination of the steps that make up offline experiments for recommender system evaluation, including the manner in which the available ratings are filtered and split into training and test; the selection of a subset of the available users for the evaluation; the choice of strategy to handle the background effects that arise when the system is unable to provide scores for some items or users; the use of either full or condensed output lists for the purposes of scoring; scoring methods themselves, including alternative top-weighted mechanisms for condensed rankings; and the application of statistical testing on a weighted-by-user or weighted-by-volume basis as a mechanism for providing confidence in measured outcomes. We carry out experiments that illustrate the impact that each of these choice points can have on the usefulness of an end-to-end system evaluation, and provide examples of possible pitfalls. In particular, we show that varying the split between training and test data, or changing the evaluation metric, or how target items are selected, or how empty recommendations are dealt with, can give rise to comparisons that are vulnerable to misinterpretation, and may lead to different or even opposite outcomes, depending on the exact combination of settings used.
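
The abstract enumerates these choice points without pinning them down. The sketch below is a minimal Python illustration, not the authors' experimental code, of three of them: a per-user train/test split, the treatment of users who receive no recommendations, and weighted-by-user versus weighted-by-volume averaging. All names (`split_ratings`, `precision_at_k`, `evaluate`) and the use of precision@k as the metric are assumptions made for the example.

```python
# A minimal sketch of some of the evaluation choice points discussed in
# the abstract. Hypothetical helper names; not the paper's actual code.
import random
from collections import defaultdict

def split_ratings(ratings, test_fraction=0.2, seed=0):
    """Per-user random split of (user, item) pairs into train and test.
    Note the pitfall: a user with very few ratings still contributes at
    least one test item, so low-activity users may lose all training data."""
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, item in ratings:
        by_user[user].append(item)
    train, test = [], []
    for user, items in by_user.items():
        rng.shuffle(items)
        cut = max(1, int(len(items) * test_fraction))
        test += [(user, i) for i in items[:cut]]
        train += [(user, i) for i in items[cut:]]
    return train, test

def precision_at_k(recommended, relevant, k=10):
    """Fraction of the top-k recommended items that are held-out relevant items."""
    return sum(1 for item in recommended[:k] if item in relevant) / k

def evaluate(recommendations, test_pairs, k=10, skip_empty=True, by_user=True):
    """Mean precision@k, exposing two choice points from the abstract:
    whether users with empty recommendation lists are skipped or scored
    as zero (skip_empty), and whether the mean weights each user equally
    (by_user=True) or by their number of test items (weighted-by-volume)."""
    relevant = defaultdict(set)
    for user, item in test_pairs:
        relevant[user].add(item)
    scores, weights = [], []
    for user, wanted in relevant.items():
        recs = recommendations.get(user, [])
        if not recs and skip_empty:
            continue  # dropping the user quietly inflates the average
        scores.append(precision_at_k(recs, wanted, k))
        weights.append(1.0 if by_user else float(len(wanted)))
    total = sum(weights)
    return sum(s * w for s, w in zip(scores, weights)) / total if total else 0.0
```

Flipping `skip_empty` or `by_user` changes the reported score without touching the recommender at all, which is exactly the kind of settings-dependent reversal the abstract warns against when comparing systems.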