2015 IEEE 29th International Parallel and Distributed Processing Symposium Workshops

Scalable Task-Parallel SGD on Matrix Factorization in Multicore Architectures


Abstract

Recommendation is an indispensable technique, especially in e-commerce services such as Amazon or Netflix, for providing users with items they are more likely to prefer. Matrix factorization is a well-known recommendation algorithm that estimates affinities between users and items solely from ratings explicitly given by users. To handle large amounts of data, stochastic gradient descent (SGD), an online loss-minimization algorithm, can be applied to matrix factorization. SGD is effective in terms of both convergence speed and memory consumption, but it is difficult to parallelize because of its essentially sequential nature. FPSGD by Zhuang et al. [fpsgd] is an existing parallel SGD method for matrix factorization that divides the rating matrix into many small blocks. Threads work on disjoint blocks so that they never update the same rows or columns of the factor matrices. Thanks to this technique, FPSGD achieves higher convergence speed than other existing methods. Still, as we demonstrate in this paper, FPSGD does not scale beyond 32 cores on the 1.4 GB Netflix dataset, because assigning non-conflicting blocks to threads requires a lock operation. In this work, we propose an alternative SGD approach for matrix factorization based on a task-parallel programming model. As a result, we successfully overcome the bottleneck of FPSGD and achieve higher scalability on 64 cores.
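To make the algorithm concrete, below is a minimal sequential sketch of SGD on matrix factorization as described in the abstract: each observed rating is visited in random order and the corresponding rows of the factor matrices P and Q are nudged to reduce the squared prediction error. The function name sgd_mf and the hyperparameters (rank k, learning rate lr, regularization reg) are illustrative choices, not values from the paper, and the sketch contains none of the parallelism (neither FPSGD's block scheduling nor the task-parallel scheme the paper proposes).

```python
import random
import numpy as np

def sgd_mf(ratings, n_users, n_items, k=16, lr=0.01, reg=0.05, epochs=20, seed=0):
    """Sequential SGD for matrix factorization (illustrative sketch, not the paper's code).

    ratings: list of (user, item, rating) triples with 0-based indices.
    Returns P (n_users x k) and Q (n_items x k) such that a rating r_ui is
    approximated by the inner product P[u] @ Q[i].
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((n_users, k))
    Q = 0.1 * rng.standard_normal((n_items, k))
    random.seed(seed)
    for _ in range(epochs):
        random.shuffle(ratings)            # visit observed ratings in random order
        for u, i, r in ratings:
            err = r - P[u] @ Q[i]          # prediction error for this single rating
            pu_old = P[u].copy()           # keep the old P[u] for the Q[i] update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu_old - reg * Q[i])
    return P, Q

# Toy usage: 3 users, 3 items, four observed ratings.
triples = [(0, 0, 5.0), (0, 1, 3.0), (1, 1, 4.0), (2, 2, 2.0)]
P, Q = sgd_mf(triples, n_users=3, n_items=3, k=4)
print(P @ Q.T)  # reconstructed (dense) rating matrix
```

The sketch also makes the parallelization problem visible: two updates conflict whenever they touch the same user row P[u] or item row Q[i]. FPSGD avoids such conflicts by scheduling threads onto non-overlapping blocks of the rating matrix, whereas this paper pursues a task-parallel formulation to remove the locking bottleneck of that scheduler.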
