Conference: IEEE International Conference on Acoustics, Speech and Signal Processing

Clustering Based Online Learning in Recommender Systems: A Bandit Approach

Abstract

A major challenge in designing and implementing large-scale online services is determining which items to recommend to users. For instance, Netflix recommends movies, Amazon recommends products, and Yahoo! recommends webpages. In these systems, items are recommended based on the characteristics and circumstances of the users, which are provided to the recommender as contexts (e.g., search history, time, and location). Building an efficient recommender system is difficult because both the item space and the context space are very large. Existing works focus on a large item space without contexts, focus on a large context space with a small number of items, or treat the item space and the context space jointly to solve the online recommendation problem. In contrast, we develop an algorithm that performs exploration and exploitation in the context space and the item space separately, combining clustering of the items with information aggregation in the context space. Given a user's context, our algorithm aggregates its past history over a ball centered on that context. The radius of the ball decreases at a rate that allows sufficiently accurate estimates of the payoffs, so that the recommended payoffs converge to the true (unknown) payoffs. Theoretical results show that our algorithm achieves a learning regret that is sublinear in time, where regret is the payoff difference between the oracle optimal benchmark, which knows the preferences of users for given items in given contexts, and our algorithm, which operates with incomplete information. Numerical results show that our algorithm significantly outperforms existing algorithms in terms of regret (by over 48%).
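The abstract gives no pseudocode, so the following is only a minimal sketch of the ball-aggregation idea it describes, not the authors' algorithm. Everything named here is an assumption made for illustration: the class `BallAggregationRecommender`, the radius schedule `(t + 1) ** (-alpha)`, the `min_samples` exploration rule, and the use of Euclidean distance between contexts. Item clustering, which the paper combines with this step, is omitted.

```python
import math
import random


class BallAggregationRecommender:
    """Illustrative contextual recommender (hypothetical, not from the paper)."""

    def __init__(self, items, alpha=0.25, min_samples=3, seed=0):
        self.items = list(items)
        self.alpha = alpha              # assumed decay exponent for the radius
        self.min_samples = min_samples  # assumed threshold that triggers exploration
        self.history = []               # (context, item, payoff) triples
        self.t = 0                      # number of observations so far
        self.rng = random.Random(seed)

    def radius(self):
        # Assumed schedule: the ball shrinks as more observations arrive.
        return (self.t + 1) ** (-self.alpha)

    def estimate(self, context, item):
        # Average the past payoffs of `item` whose contexts fall inside the ball.
        r = self.radius()
        payoffs = [p for c, i, p in self.history
                   if i == item and math.dist(c, context) <= r]
        if not payoffs:
            return None, 0
        return sum(payoffs) / len(payoffs), len(payoffs)

    def recommend(self, context):
        under_explored = []
        best_item, best_mean = None, -math.inf
        for item in self.items:
            mean, n = self.estimate(context, item)
            if n < self.min_samples:
                under_explored.append(item)
            elif mean > best_mean:
                best_item, best_mean = item, mean
        if under_explored:
            # Explore: too few nearby samples to trust the estimate.
            return self.rng.choice(under_explored)
        # Exploit: highest estimated payoff inside the ball.
        return best_item

    def update(self, context, item, payoff):
        self.history.append((tuple(context), item, payoff))
        self.t += 1
```

A caller would alternate `recommend(context)` and `update(context, item, payoff)` once per arriving user. In this sketch, a context with little nearby history triggers exploration, while a well-sampled neighborhood leads to exploiting the item with the highest average payoff.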
