Venue: IEEE International Conference on Data Engineering

HaTen2: Billion-scale tensor decompositions

Abstract

How can we find useful patterns and anomalies in large-scale real-world data with multiple attributes, for example network intrusion logs with (source-ip, target-ip, port-number, timestamp)? Tensors are suitable for modeling such multi-dimensional data and are widely used for the analysis of social networks, web data, network traffic, and many other settings. However, current tensor decomposition methods do not scale to tensors with millions and billions of rows, columns, and 'fibers', which often appear in real datasets. In this paper, we propose HaTen2, a scalable distributed suite of tensor decomposition algorithms running on the MapReduce platform. By carefully reordering the operations and exploiting the sparsity of real-world tensors, HaTen2 dramatically reduces the intermediate data and the number of jobs. As a result, using HaTen2, we analyze big real-world tensors that cannot be handled by the current state of the art, and discover hidden concepts.
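To make the idea of exploiting sparsity concrete, here is a minimal single-machine sketch in Python. It is not the HaTen2 MapReduce implementation; it only illustrates one kernel commonly used in CP (PARAFAC) style decompositions, the matricized-tensor-times-Khatri-Rao product, computed by visiting nonzero entries only so that no large intermediate matrix is built. The toy data, the three-mode layout (source, target, port), and the names `entries`, `mttkrp_mode0`, `B`, and `C` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy tensor in coordinate format:
# (source index, target index, port index) -> count.
# Only nonzero entries are stored.
entries = {
    (0, 1, 0): 2.0,
    (0, 2, 1): 1.0,
    (1, 2, 1): 3.0,
}


def mttkrp_mode0(entries, B, C, rank):
    """Mode-0 MTTKRP over nonzero entries only.

    Computes M[i][r] = sum over (j, k) of X[i, j, k] * B[j][r] * C[k][r]
    without materializing the Khatri-Rao product of C and B.
    """
    M = defaultdict(lambda: [0.0] * rank)
    for (i, j, k), value in entries.items():
        row = M[i]
        for r in range(rank):
            row[r] += value * B[j][r] * C[k][r]
    return dict(M)


# Assumed factor matrices for the target and port modes (rank 2).
B = [[1.0, 0.0], [0.5, 1.0], [2.0, 1.0]]  # 3 targets x rank 2
C = [[1.0, 2.0], [0.0, 1.0]]              # 2 ports x rank 2

M = mttkrp_mode0(entries, B, C, rank=2)
```

The work done is proportional to the number of nonzeros times the rank, rather than to the full size of the tensor, which is the kind of saving that matters when most entries are zero.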