2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops

Scalable Task-Parallel SGD on Matrix Factorization in Multicore Architectures


Abstract

Recommendation is an indispensable technique, especially in e-commerce services such as Amazon or Netflix, for providing users with items they are more likely to prefer. Matrix factorization is a well-known recommendation algorithm that estimates affinities between users and items solely from ratings explicitly given by users. To handle large amounts of data, stochastic gradient descent (SGD), an online loss-minimization algorithm, can be applied to matrix factorization. SGD is effective in terms of both convergence speed and memory consumption, but it is difficult to parallelize because of its essentially sequential nature. FPSGD by Zhuang et al. [fpsgd] is an existing parallel SGD method for matrix factorization that divides the rating matrix into many small blocks. Threads work on disjoint blocks so that they never update the same rows or columns of the factor matrices. Thanks to this technique, FPSGD achieves higher convergence speed than other existing methods. Still, as we demonstrate in this paper, FPSGD does not scale beyond 32 cores on the 1.4 GB Netflix dataset, because assigning non-conflicting blocks to threads requires a lock operation. In this work, we propose an alternative SGD approach for matrix factorization based on a task-parallel programming model. As a result, we successfully overcome the bottleneck of FPSGD and achieve higher scalability on 64 cores.
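To make the algorithm concrete, below is a minimal sequential sketch of SGD on matrix factorization as described in the abstract: each observed rating is visited in random order and the corresponding rows of the factor matrices P and Q are nudged to reduce the squared prediction error. The function name sgd_mf and the hyperparameters (rank k, learning rate lr, regularization reg) are illustrative choices, not values from the paper, and the sketch contains none of the parallelism (neither FPSGD's block scheduling nor the task-parallel scheme the paper proposes).

```python
import random
import numpy as np

def sgd_mf(ratings, n_users, n_items, k=16, lr=0.01, reg=0.05, epochs=20, seed=0):
    """Sequential SGD for matrix factorization (illustrative sketch, not the paper's code).

    ratings: list of (user, item, rating) triples with 0-based indices.
    Returns P (n_users x k) and Q (n_items x k) such that a rating r_ui is
    approximated by the inner product P[u] @ Q[i].
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    random.seed(seed)
    for _ in range(epochs):
        random.shuffle(ratings)            # visit observed ratings in random order
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error for this single rating
            pu_old = P[u].copy()           # keep the old P[u] for the Q[i] update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu_old - reg * Q[i])
    return P, Q

# Toy usage: 3 users, 3 items, four observed ratings.
triples = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (2, 2, 2.0)]
P, Q = sgd_mf(triples, n_users=3, n_items=3, k=4)
print(P @ Q.T)  # reconstructed (dense) rating matrix
```

The sketch also makes the parallelization problem visible: two updates conflict whenever they touch the same user row P[u] or item row Q[i]. FPSGD avoids such conflicts by scheduling threads onto non-overlapping blocks of the rating matrix, whereas this paper pursues a task-parallel formulation to remove the locking bottleneck of that scheduler.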
