Journal: ACM Transactions on Knowledge Discovery from Data

ParCube: Sparse Parallelizable CANDECOMP-PARAFAC Tensor Decomposition

Abstract

How can we efficiently decompose a tensor into sparse factors when the data do not fit in memory? Tensor decompositions have steadily gained popularity in data-mining applications; however, current state-of-the-art decomposition algorithms operate in main memory and do not scale to truly large datasets. In this work, we propose ParCube, a new and highly parallelizable method for speeding up tensor decompositions that is well suited to producing sparse approximations. Experiments with even moderately large data indicate over 90% sparser outputs and 14 times faster execution, with approximation error close to that of the current state of the art, irrespective of computation and memory requirements. We provide theoretical guarantees for the algorithm's correctness and validate our claims through extensive experiments, including four real-world datasets (Enron, Lbnl, Facebook and Nell), demonstrating its effectiveness for data-mining practitioners. In particular, we are the first to analyze the very large Nell dataset using a sparse tensor decomposition, demonstrating that ParCube enables us to handle very large datasets effectively and efficiently. Finally, we make our highly scalable parallel implementation publicly available, enabling reproducibility of our work.
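For readers unfamiliar with the model named in the title, the following is a minimal sketch of a plain CANDECOMP-PARAFAC fit by alternating least squares on a small dense 3-way tensor. It is not ParCube and not the authors' implementation; it only illustrates the kind of in-memory baseline that the abstract says does not scale. The function names (`khatri_rao`, `unfold`, `cp_als`, `reconstruct`) and all parameters are assumptions chosen for illustration.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product of U (I x R) and V (J x R), shape (I*J) x R."""
    R = U.shape[1]
    return np.einsum("ir,jr->ijr", U, V).reshape(-1, R)

def unfold(X, mode):
    """Matricize X along `mode`, keeping the remaining modes in C order."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def cp_als(X, rank, n_iter=100, seed=0):
    """Fit a rank-`rank` CP model to a dense 3-way array by alternating least squares."""
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Hold the other two factors fixed and solve for this one.
            first, second = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(first, second)
            # Gram matrix of the Khatri-Rao product, computed cheaply
            # as an element-wise product of the two small Gram matrices.
            gram = (first.T @ first) * (second.T @ second)
            factors[mode] = unfold(X, mode) @ kr @ np.linalg.pinv(gram)
    return factors

def reconstruct(factors):
    """Rebuild the tensor as a sum of rank-one outer products."""
    A, B, C = factors
    return np.einsum("ir,jr,kr->ijk", A, B, C)
```

Calling `cp_als(X, rank=R)` returns three factor matrices whose columns define the rank-one components, and `reconstruct` gives the approximation whose error can be compared against `X`. Note that this sketch keeps the full tensor and dense factors in memory, which is exactly the limitation the abstract raises.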
