
Learning to Improve Recommender Systems.


Abstract

With the rapid development of e-commerce, music and video streaming, and social sharing websites, users today face an explosion of choices. This unprecedented number of choices leads to the information overload problem: the difficulty a user has in understanding an issue and making decisions when too much information is present. Recommender systems learn users' preferences from their past behavior and make suggestions for them. These systems are a key component in alleviating the information overload problem. Research on recommender systems has made encouraging progress, from neighborhood-based methods to model-based methods. However, the recommender systems employed today are far from perfect. In this thesis, we propose to improve recommender systems from four perspectives, each motivated by a real-life problem.

First and foremost, we develop online algorithms for collaborative filtering methods, which are widely applicable to recommender systems. Traditionally, batch-training algorithms have been developed for collaborative filtering. They have the advantage of being easy to understand and simple to implement. However, batch-training algorithms fail to consider the dynamic scenario in which new users and new items join the system constantly. To make recommendations for these new users and on these new items, batch-training algorithms need to retrain the model from scratch. Moreover, during batch training, all the data have to be processed in each iteration. This is prohibitively slow given the sheer number of users and items that a real recommender system faces. Online learning algorithms address both problems by updating the model incrementally with each incoming rating.

Secondly, we question an assumption made implicitly by most recommender systems. Most existing recommender systems assume that the collected ratings and the unobserved ratings follow the same distribution. Using data collected from a real-life recommender system, we show that this assumption is unlikely to be true. By employing missing data theory, we develop a model that drops this unrealistic assumption and makes unbiased predictions.

Thirdly, we examine the spam problem confronting recommender systems. Ratings assigned by spam users contaminate the data of a recommender system and degrade the experience of normal users. We propose to use a reputation estimation system to keep track of users' reputations and to identify spam users based on those reputations. We develop a unified framework for reputation estimation that subsumes a number of existing reputation estimation methods. Based on this framework, we also develop a matrix-factorization-based method that demonstrates outstanding discrimination ability.

Lastly, we integrate content-based filtering with collaborative filtering to alleviate the cold-start problem. The cold-start problem refers to the situation in which a system has too little information about a user or an item to make accurate recommendations. Review comments carry rich, readily available information that is generally discarded; by using it, we can alleviate the cold-start problem. Additionally, we can attach interpretable tags to the otherwise black-box collaborative filtering algorithm, which helps a recommender system explain why items are being recommended.

In summary, this thesis addresses some of the major problems faced by recommender systems and improves them from several perspectives. Extensive experiments on large-scale real-life datasets confirm the effectiveness and efficiency of the proposed models.
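To make the contrast between batch and online training concrete, the following is a minimal illustrative sketch of an incremental update for a matrix factorization model, in which each incoming rating triggers one stochastic gradient step and unseen users or items are given factor vectors on first contact. It is not the algorithm developed in the thesis; the class name `OnlineMF`, its parameters (`n_factors`, `lr`, `reg`), and the squared-error objective with L2 regularization are all assumptions chosen for illustration.

```python
import random


class OnlineMF:
    """Hypothetical online matrix factorization model (illustrative only)."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.02, seed=0):
        self.k = n_factors          # assumed latent dimension
        self.lr = lr                # assumed learning rate
        self.reg = reg              # assumed L2 regularization weight
        self.rng = random.Random(seed)
        self.user_factors = {}      # user id -> list of k floats
        self.item_factors = {}      # item id -> list of k floats

    def _init_vector(self):
        # Small random values so a new user or item starts near zero.
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.k)]

    def predict(self, user, item):
        p = self.user_factors.get(user)
        q = self.item_factors.get(item)
        if p is None or q is None:
            return 0.0              # nothing is known about this pair yet
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item, rating):
        """Apply one SGD step for a single rating; no retraining from scratch."""
        p = self.user_factors.setdefault(user, self._init_vector())
        q = self.item_factors.setdefault(item, self._init_vector())
        err = rating - sum(a * b for a, b in zip(p, q))
        for f in range(self.k):
            pf, qf = p[f], q[f]
            p[f] += self.lr * (err * qf - self.reg * pf)
            q[f] += self.lr * (err * pf - self.reg * qf)
        return err                  # error measured before this step
```

In this sketch the cost of handling a new rating depends only on the number of latent factors, not on the size of the rating history, which is the property that makes incremental updating attractive when users and items arrive continuously.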

Bibliographic record

  • Author

    Ling, Guang.

  • Author affiliation

    The Chinese University of Hong Kong (Hong Kong).

  • Degree-granting institution The Chinese University of Hong Kong (Hong Kong).
  • Subjects Computer science; Computer engineering; Web studies; Marketing.
  • Degree Ph.D.
  • Year 2015
  • Pages 202 p.
  • Total pages 202
  • Original format PDF
  • Language of text eng