Real-time incremental recommendation for streaming data based on Apache Flink

Tang Zhuo; Liu Zeyu; Li Kenli; Li Keqin


Abstract

Collaborative filtering (CF), one of the best-known methods for building recommendation systems, recommends relevant items to users or predicts users' ratings of items they have not yet rated. Matrix factorization (MF) models are a well-known approach to the rating prediction problem. However, a recommendation system based on matrix factorization struggles to keep up with rapidly changing real-world data. When ratings from new users or on new items arrive, the static model cannot fit the new data well. As a consequence, if the current model no longer applies, prediction accuracy degrades. In addition, rebuilding the model on the whole data set incurs a significant computational cost. To capture these changes, in this paper we construct an online-and-offline collaborative filtering model that combines multiple methods to improve on the traditional CF method, called Online SGD with Offline Knowledge (OSGDO for short). We also propose a real-time incremental recommendation framework on Apache Flink, a scalable stream and batch data processing platform, and implement our method on this framework. Our method proves effective at online training when new observations arrive. Experimental results show that the proposed dynamic training process is more efficient than rebuilding the model on all the data. Our algorithm also performs well in practice and quickly achieves impressive accuracy when tested on the well-known MovieLens and Netflix data sets.
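The abstract does not give the OSGDO update rule, so the following is only a minimal sketch of the general idea it describes: a matrix factorization model that starts from factors learned offline and is then adjusted by one stochastic gradient descent (SGD) step per incoming rating, rather than being rebuilt on all the data. The class name IncrementalMF, the method names, and the hyperparameters (rank, lr, reg) are assumptions made for illustration and do not come from the paper; the sketch also runs in plain Python rather than on Apache Flink.

```python
import random


class IncrementalMF:
    """Plain matrix factorization trained one rating at a time with SGD.

    Illustrative only: a generic incremental update, not the OSGDO
    algorithm from the paper. All names here are hypothetical.
    """

    def __init__(self, rank=4, lr=0.05, reg=0.02, seed=0):
        self.rank = rank      # number of latent factors
        self.lr = lr          # SGD learning rate
        self.reg = reg        # L2 regularization strength
        self._rng = random.Random(seed)
        self.user_factors = {}
        self.item_factors = {}

    def _new_vector(self):
        # Small random start for users or items never seen before.
        return [self._rng.uniform(-0.1, 0.1) for _ in range(self.rank)]

    def warm_start(self, user_factors, item_factors):
        # Copy in factors from a model trained offline (assumed to have
        # the same rank), standing in for "offline knowledge".
        for key, vec in user_factors.items():
            self.user_factors[key] = list(vec)
        for key, vec in item_factors.items():
            self.item_factors[key] = list(vec)

    def predict(self, user, item):
        p = self.user_factors.get(user)
        q = self.item_factors.get(item)
        if p is None or q is None:
            return 0.0  # no information yet for this user or item
        return sum(a * b for a, b in zip(p, q))

    def update(self, user, item, rating):
        # One SGD step on the regularized squared error of one rating.
        if user not in self.user_factors:
            self.user_factors[user] = self._new_vector()
        if item not in self.item_factors:
            self.item_factors[item] = self._new_vector()
        p = self.user_factors[user]
        q = self.item_factors[item]
        err = rating - sum(a * b for a, b in zip(p, q))
        for k in range(self.rank):
            pk, qk = p[k], q[k]
            p[k] += self.lr * (err * qk - self.reg * pk)
            q[k] += self.lr * (err * pk - self.reg * qk)
        return err  # prediction error before the step
```

In use, each arriving (user, item, rating) event would be passed to update, and predict would serve recommendations between events. In a streaming deployment the factor tables would live in the stream processor's state instead of Python dictionaries; how the paper arranges this on Flink is not stated in the abstract.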


Bibliographic details

Source: Intelligent Data Analysis, 2019, No. 6, pp. 1421-1437 (17 pages)
Authors: Tang Zhuo; Liu Zeyu; Li Kenli; Li Keqin
Affiliation (all four authors): Hunan University, College of Computer Science and Electronic Engineering, Changsha 410082, Hunan, People's Republic of China
Indexed in: Science Citation Index (SCI)
Original format: PDF
Language: English
Keywords: Collaborative filtering; online learning; incremental learning; recommendation system; low-rank matrix factorization; Apache Flink


Similar literature

1. Real-Time Analysis of Vital Signs Using Incremental Data Stream Mining Techniques with a Case Study of ARDS Under ICU Treatment [J]. Fong Simon, Siu Shirley W. I., Zhou Suzy. Journal of Medical Imaging and Health Informatics, 2015, No. 5.
2. RDMA-Based Apache Storm for High-Performance Stream Data Processing [J]. Ziyu Zhang, Zitan Liu, Qingcai Jiang. International Journal of Parallel Programming, 2021, No. 5.
3. Efficient Streaming Mass Spatio-Temporal Vehicle Data Access in Urban Sensor Networks Based on Apache Storm [J]. Lianjie Zhou, Nengcheng Chen, Zeqiang Chen. Sensors, 2017, No. 4.
4. The Tentative Research of Hydrological IoT Data Processing System Based on Apache Flink [C]. Feng Ye, Peng Zhang, Cheng Hu. International Conference on Service-Oriented Computing, 2019.
5. Development of a Standardized Framework for Cost-Effective Communication System Based on 3D Data Streaming and Real-Time 3D Reconstruction [D]. Huynh, Dang Duong Hai. 2017.
6. Efficient Streaming Mass Spatio-Temporal Vehicle Data Access in Urban Sensor Networks Based on Apache Storm [O]. Lianjie Zhou, Nengcheng Chen, Zeqiang Chen. 2017.
7. CPiX: Real-Time Analytics Over Out-of-Order Data Streams By Incremental Sliding-Window Aggregation [O]. Savong Bou, Hiroyuki Kitagawa, Toshiyuki Amagasa. 2021.
