首页> 外文会议>IEEE International Parallel and Distributed Processing Symposium >Efficient parallel CP decomposition with pairwise perturbation and multi-sweep dimension tree
【24h】

Efficient parallel CP decomposition with pairwise perturbation and multi-sweep dimension tree

机译:高效并行CP分解与成对扰动和多扫尺寸树

获取原文

摘要

The widely used alternating least squares (ALS) algorithm for the canonical polyadic (CP) tensor decomposition is dominated in cost by the matricized-tensor times Khatri-Rao product (MTTKRP) kernel. This kernel is necessary to set up the quadratic optimization subproblems. State-of-the-art parallel ALS implementations use dimension trees to avoid redundant computations across MTTKRPs within each ALS sweep. In this paper, we propose two new parallel algorithms to accelerate CP-ALS. We introduce the multi-sweep dimension tree (MSDT) algorithm, which requires the contraction between an order N input tensor and the first-contracted input matrix once every $(N-1)/N$ sweeps. This algorithm reduces the leading order computational cost by a factor of $2(N-1)/N$ relative to the best previously known approach. In addition, we introduce a more communication-efficient approach to parallelizing an approximate CP-ALS algorithm, pairwise perturbation. This technique uses perturbative corrections to the subproblems rather than recomputing the contractions, and asymptotically accelerates ALS. Our benchmark results on 1024 processors on the Stampede2 supercomputer show that CP decomposition obtains a 1.25X speed-up from MSDT and a 1.94X speedup from pairwise perturbation compared to the state-of-the-art dimension-tree based CP-ALS implementations.
机译:广泛使用的交替最小二乘(Als)算法用于规范多差异(CP)张量分解,由Matricized-Tensor Times Khatri-Rao产品(MTTKRP)内核成本支配。此内核是设置二次优化子问题的必要条件。最先进的并行ALS实现使用尺寸树,以避免每个ALS扫描内的MTTKRP跨越冗余计算。在本文中,我们提出了两个新的并行算法来加速CP-ALS。我们介绍了多扫描尺寸树(MSDT)算法,需要每次$(n-1)/ n $ sweeps一次性N个输入张量和第一签约输入矩阵之间的收缩。该算法将领先的订单计算成本降低了2倍(n-1)/ n $相对于最好的先前已知的方法。此外,我们介绍了一种更通信的高效方法来并行化近似CP-ALS算法,成对扰动。该技术使用扰动校正到子问题,而不是重新计算收缩,并渐近加速ALS。我们的基准导致StampEde2上的1024个处理器上的结果表明,与基于最先进的维度树的CP-ALS实现相比,CP分解从MSDT和1.94倍加速度获得1.25倍的加速。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号