Journal: Expert Systems with Applications

An empirical study on the effect of data sparsity and data overlap on cross domain collaborative filtering performance


Abstract

Today, the oversaturation of data complicates the process of finding information in a data source. Recommender systems aim to alleviate this problem in various domains by actively suggesting selected information to potential users based on their personal preferences. Among these approaches, collaborative filtering based recommenders (CF recommenders), which make use of users' implicit and explicit ratings for items, are widely regarded as the most successful type of recommender system. However, CF recommenders are sensitive to data sparsity, where users rate very few items or items receive very few ratings, so that there is not enough data to make a recommendation. Most studies have attempted to solve these issues by developing new algorithms within a single domain. Recently, cross-domain recommenders that use datasets from multiple domains have attracted increasing attention in the research community. Cross-domain recommenders assume that users who express their preferences in one domain (called the target domain) will also express their preferences in another domain (called the source domain), and that these additional preferences will improve the precision and recall of recommendations to the user. The purpose of this study is to investigate the effects of various data sparsity and data overlap issues on the performance of cross-domain CF recommenders, using various aggregation functions. In this study, several cross-domain recommenders were created by collecting three datasets from three separate domains of a large Korean fashion company and combining them with different algorithms and different aggregation approaches. The cross-domain recommenders that used high-performance, high-overlap domains showed significant improvement in the precision and recall of recommendations when the recommendation scores of individual domains were combined using the summation aggregation function. However, the cross-domain recommenders that used low-performance, low-overlap domains showed little or no performance improvement in all areas. This result implies that the use of cross-domain recommenders does not guarantee performance improvement; rather, it is necessary to consider the relevant factors carefully to achieve performance improvement when using cross-domain recommenders. (C) 2017 Elsevier Ltd. All rights reserved.
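
The quantities named in the abstract (data sparsity, user overlap between domains, summation aggregation of per-domain scores, and precision and recall) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give their definitions, so the sparsity formula, the Jaccard-style overlap, the treatment of missing items as a score of 0, and all function names, item names, and score values below are assumptions chosen for illustration.

```python
from collections import defaultdict


def sparsity(ratings, n_users, n_items):
    """Fraction of user-item cells with no rating (assumed definition)."""
    return 1.0 - len(ratings) / (n_users * n_items)


def user_overlap(users_a, users_b):
    """Jaccard overlap of two domains' user sets (assumed definition)."""
    a, b = set(users_a), set(users_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def sum_aggregate(domain_scores):
    """Summation aggregation: add each item's score across domains.

    An item that a domain did not score contributes 0 (an assumption).
    """
    total = defaultdict(float)
    for scores in domain_scores:
        for item, score in scores.items():
            total[item] += score
    return dict(total)


def top_k(scores, k):
    """Return the k highest-scoring items; ties are broken by item name."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:k]]


def precision_recall(recommended, relevant):
    """Precision and recall of a recommendation list against relevant items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical per-domain recommendation scores for one user.
target_scores = {"coat": 0.6, "scarf": 0.2, "boots": 0.5}
source_scores = {"scarf": 0.7, "boots": 0.3, "hat": 0.4}

combined = sum_aggregate([target_scores, source_scores])
recommended = top_k(combined, k=2)
precision, recall = precision_recall(recommended, relevant={"scarf", "coat"})
```

In this toy case the source domain lifts "scarf" and "boots" above "coat", which shows how a source domain can change the ranking in either direction. That is consistent with the abstract's point that cross-domain aggregation does not guarantee an improvement.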
