Journal: ACM Transactions on Mathematical Software
TuckerMPI: A Parallel C++/MPI Software Package for Large-scale Data Compression via the Tucker Tensor Decomposition

Abstract

Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte output of a high-fidelity computational simulation. For such data sets, we have developed TuckerMPI, a parallel C++/MPI software package for compressing distributed data. The approach is based on treating the data as a tensor, i.e., a multidimensional array, and computing its truncated Tucker decomposition, a higher-order analogue of the truncated singular value decomposition of a matrix. The result is a low-rank approximation of the original tensor-structured data. Compression efficiency is achieved by detecting latent global structure within the data, in contrast to most compression methods, which focus on local structure. In this work, we describe TuckerMPI, our implementation of the truncated Tucker decomposition, including details of the data distribution and in-memory layouts, the parallel and serial implementations of the key kernels, and analysis of the storage, communication, and computational costs. We test the software on 4.5 and 6.7 terabyte data sets distributed across hundreds of nodes (thousands of MPI processes), achieving compression ratios between 100x and 200,000x, which equates to 99-99.999% compression (depending on the desired accuracy), in substantially less time than it would take to even read the same data set from a parallel file system. Moreover, we show that our method also allows for reconstruction of partial or down-sampled data on a single node, without a parallel computer, so long as the reconstructed portion is small enough to fit on a single machine, e.g., when reconstructing or visualizing a single down-sampled time step or computing summary statistics. The code is available at https://gitlab.com/tensors/TuckerMPI.
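
The link between a compression ratio and the quoted percentage can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from TuckerMPI: the function names and the 100 x 100 x 100 example tensor with ranks (10, 10, 10) are hypothetical. It assumes the standard Tucker storage count, namely one core tensor of size R_1 x ... x R_N plus one I_n x R_n factor matrix per mode.

```python
from math import prod


def tucker_storage(dims, ranks):
    """Values stored by a Tucker approximation: the core tensor plus
    one (dims[n] x ranks[n]) factor matrix per mode."""
    if len(dims) != len(ranks):
        raise ValueError("dims and ranks must have the same length")
    if any(r < 1 or r > i for i, r in zip(dims, ranks)):
        raise ValueError("each rank must satisfy 1 <= rank <= dimension")
    core = prod(ranks)
    factors = sum(i * r for i, r in zip(dims, ranks))
    return core + factors


def compression_ratio(dims, ranks):
    """Size of the full tensor divided by size of its Tucker form."""
    return prod(dims) / tucker_storage(dims, ranks)


def percent_reduction(ratio):
    """Percentage of storage saved at a given compression ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return 100.0 * (1.0 - 1.0 / ratio)


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration, not from the paper.
    dims, ranks = (100, 100, 100), (10, 10, 10)
    print("stored values:", tucker_storage(dims, ranks))
    print("compression ratio:", compression_ratio(dims, ranks))
    # The two ends of the range of ratios quoted in the abstract.
    for ratio in (100, 200_000):
        print(ratio, "->", percent_reduction(ratio), "% reduction")
```

Under this accounting, a ratio of 100x corresponds to a 99% reduction and a ratio of 200,000x to roughly 99.9995%, which is consistent with the 99-99.999% range quoted above. The ratio grows quickly as the ranks shrink relative to the dimensions, because the core tensor scales with the product of the ranks.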