Conference: SIAM International Conference on Data Mining

Streaming Tensor Factorization for Infinite Data Sources

Abstract

Sparse tensor factorization is a popular tool in multi-way data analysis and is used in applications such as cybersecurity, recommender systems, and social network analysis. In many of these applications, the tensor is not known a priori and instead arrives in a streaming fashion for a potentially unbounded amount of time. Existing approaches for streaming sparse tensors are not practical for unbounded streaming because they rely on maintaining the full factorization of the data, which grows linearly with time. In this work, we present CP-stream, an algorithm for streaming factorization in the model of the canonical polyadic decomposition which does not grow linearly in time or space, and is thus practical for long-term streaming. Additionally, CP-stream incorporates user-specified constraints such as non-negativity which aid in the stability and interpretability of the factorization. An evaluation of CP-stream demonstrates that it converges faster than state-of-the-art streaming algorithms while achieving lower reconstruction error by an order of magnitude. We also evaluate it on real-world sparse datasets and demonstrate its usability in both network traffic analysis and discussion tracking. Our evaluation uses exclusively public datasets and our source code is released to the public as part of SPLATT, an open source high-performance tensor factorization toolkit.
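To make the abstract's vocabulary concrete, here is a minimal sketch of the canonical polyadic (CP) model applied to a single incoming time slice. It is not CP-stream and not SPLATT code: the helper names (`model_slice`, `fit_slice_weights`, `relative_error`), the dense list-of-lists slice, and the projected-gradient solver are all assumptions chosen for illustration. The sketch assumes the non-temporal factor matrices `A` and `B` are already known and fits only the non-negative temporal weights `s` for the new slice, so nothing about earlier slices has to be stored.

```python
import math

def model_slice(A, B, s):
    """Rank-R CP model of one time slice: entry (i, j) = sum_r s[r] * A[i][r] * B[j][r]."""
    rank = len(s)
    return [[sum(s[r] * a_row[r] * b_row[r] for r in range(rank)) for b_row in B]
            for a_row in A]

def fit_slice_weights(X, A, B, steps=200, lr=0.05):
    """Hypothetical helper: fit non-negative weights s for slice X with A and B held fixed.

    Uses projected gradient descent on the squared reconstruction error;
    the projection max(0, .) enforces the non-negativity constraint.
    """
    rank = len(A[0])
    s = [0.0] * rank
    for _ in range(steps):
        M = model_slice(A, B, s)
        grad = [0.0] * rank
        for i, a_row in enumerate(A):
            for j, b_row in enumerate(B):
                resid = M[i][j] - X[i][j]
                for r in range(rank):
                    grad[r] += 2.0 * resid * a_row[r] * b_row[r]
        s = [max(0.0, s[r] - lr * grad[r]) for r in range(rank)]
    return s

def relative_error(X, M):
    """Frobenius-norm reconstruction error of model M relative to the norm of X."""
    num = sum((x - m) ** 2 for xr, mr in zip(X, M) for x, m in zip(xr, mr))
    den = sum(x ** 2 for xr in X for x in xr)
    return math.sqrt(num) / math.sqrt(den)

# Toy rank-2 factors (illustrative values only, not data from the paper).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0]]

# A new slice "arrives"; only A, B, and the slice itself are needed to process it.
X = model_slice(A, B, [2.0, 0.5])
s = fit_slice_weights(X, A, B)
err = relative_error(X, model_slice(A, B, s))
```

In this toy setting the slice is generated exactly from the model, so the fitted weights approach the generating weights and the relative error approaches zero. A full streaming method would also refine `A` and `B` as slices arrive and would exploit sparsity rather than looping over every entry.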
