Journal: Intelligent Data Analysis

Real-time incremental recommendation for streaming data based on Apache Flink



Abstract

Collaborative filtering (CF), one of the best-known methods for building recommendation systems, recommends relevant items to users or predicts users' ratings of items they have not yet rated. Matrix factorization (MF) models are a well-known approach to the rating prediction problem. However, a recommendation system based on matrix factorization struggles to keep up with rapidly changing real-world data. When ratings from new users or on new items arrive, the static model does not fit the new data well, and prediction accuracy degrades once the current model no longer applies. In addition, rebuilding the model on the whole data set incurs a significant computational cost. To capture these changes, in this paper we construct an online-and-offline collaborative filtering approach with a multi-method model that improves on the traditional CF method, called Online SGD with Offline Knowledge (OSGDO for short). We also propose a real-time incremental recommendation framework on Apache Flink, a scalable stream and batch data processing platform, and implement our method on this framework. Our method proves effective at online training when new observations arrive. Experimental results show that the proposed dynamic training process is more efficient than rebuilding the model on all the data. Our algorithm also performs well in practice and quickly reaches impressive accuracy when tested on the well-known MovieLens and Netflix data sets.
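The abstract does not spell out the OSGDO update rule, so the following is only a minimal sketch of the general idea it builds on: a matrix factorization model updated by one SGD step per incoming rating instead of being rebuilt on all data. It is not the authors' algorithm and does not use Apache Flink. The class name IncrementalMF, its hyperparameters, and the optional warm start from offline-trained factors are all assumptions made for illustration.

```python
import random


class IncrementalMF:
    """Hypothetical sketch: matrix factorization trained one rating at a time."""

    def __init__(self, n_factors=8, lr=0.05, reg=0.02, seed=0,
                 user_factors=None, item_factors=None):
        self.n_factors = n_factors
        self.lr = lr          # SGD learning rate (assumed value)
        self.reg = reg        # L2 regularization strength (assumed value)
        self.rng = random.Random(seed)
        # Assumption: factors from an offline batch run may be passed in
        # as a warm start; otherwise both tables start empty.
        self.user_factors = dict(user_factors or {})
        self.item_factors = dict(item_factors or {})

    def _new_vector(self):
        # Small random initialization for a user or item never seen before.
        return [self.rng.gauss(0.0, 0.1) for _ in range(self.n_factors)]

    def predict(self, user, item):
        p = self.user_factors.get(user)
        q = self.item_factors.get(item)
        if p is None or q is None:
            return 0.0  # cold start: no factors yet for this pair
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item, rating):
        """Apply one SGD step for a single (user, item, rating) observation."""
        p = self.user_factors.setdefault(user, self._new_vector())
        q = self.item_factors.setdefault(item, self._new_vector())
        err = rating - sum(a * b for a, b in zip(p, q))
        for k in range(self.n_factors):
            pk, qk = p[k], q[k]
            p[k] += self.lr * (err * qk - self.reg * pk)
            q[k] += self.lr * (err * pk - self.reg * qk)
        return err


if __name__ == "__main__":
    # Simulated rating stream; in a streaming job each event would arrive
    # from the source operator rather than from a list.
    stream = [("alice", "film_a", 5.0), ("bob", "film_a", 3.0),
              ("alice", "film_b", 4.0)] * 100
    model = IncrementalMF()
    for user, item, rating in stream:
        model.update(user, item, rating)
    print(model.predict("alice", "film_a"))
```

Each call to update touches only the two factor vectors involved in the new rating, which is why this style of training costs far less per observation than refitting the whole model.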
