TuckerMPI: A Parallel C++/MPI Software Package for Large-scale Data Compression via the Tucker Tensor Decomposition

Ballard Grey; Klinvex Alicia; Kolda Tamara G.


Abstract

Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte output of a high-fidelity computational simulation. For such data sets, we have developed a new software package called TuckerMPI, a parallel C++/MPI software package for compressing distributed data. The approach is based on treating the data as a tensor, i.e., a multidimensional array, and computing its truncated Tucker decomposition, a higher-order analogue to the truncated singular value decomposition of a matrix. The result is a low-rank approximation of the original tensor-structured data. Compression efficiency is achieved by detecting latent global structure within the data, in contrast to most compression methods, which focus on local structure. In this work, we describe TuckerMPI, our implementation of the truncated Tucker decomposition, including details of the data distribution and in-memory layouts, the parallel and serial implementations of the key kernels, and analysis of the storage, communication, and computational costs. We test the software on 4.5 and 6.7 terabyte data sets distributed across hundreds of nodes (thousands of MPI processes), achieving compression ratios between 100x and 200,000x, which equates to 99% to 99.999% compression (depending on the desired accuracy), in substantially less time than it would take to even read the same data set from a parallel file system. Moreover, we show that our method also allows for reconstruction of partial or down-sampled data on a single node, without a parallel computer, so long as the reconstructed portion is small enough to fit on a single machine, e.g., when reconstructing or visualizing a single down-sampled time step or computing summary statistics. The code is available at https://gitlab.com/tensors/TuckerMPI.
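To make the idea of a truncated Tucker decomposition concrete, the following is a minimal serial NumPy sketch, not TuckerMPI's implementation or API. It assumes a sequentially truncated HOSVD with fixed target ranks, chosen here only for illustration; the abstract does not say which variant or rank-selection rule the package uses. All function names (`unfold`, `mode_product`, `st_hosvd`, `reconstruct`, `compression_ratio`) and the synthetic test tensor are hypothetical.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (tensor-times-matrix)."""
    moved = np.moveaxis(tensor, mode, 0)
    result = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(result, 0, mode)


def st_hosvd(tensor, ranks):
    """Sequentially truncated HOSVD with one fixed target rank per mode."""
    core = tensor
    factors = []
    for mode, rank in enumerate(ranks):
        # Leading left singular vectors of the current unfolding.
        u, _, _ = np.linalg.svd(unfold(core, mode), full_matrices=False)
        u = u[:, :rank]
        factors.append(u)
        # Shrink the working tensor along this mode before the next one.
        core = mode_product(core, u.T, mode)
    return core, factors


def reconstruct(core, factors):
    """Expand the core back to full size by applying each factor matrix."""
    full = core
    for mode, u in enumerate(factors):
        full = mode_product(full, u, mode)
    return full


def compression_ratio(shape, ranks):
    """Original entry count divided by core plus factor-matrix entry count."""
    original = np.prod(shape)
    compressed = np.prod(ranks) + sum(n * r for n, r in zip(shape, ranks))
    return original / compressed


# Synthetic tensor with exact multilinear rank (3, 4, 5); an assumption made
# so that the truncated decomposition can recover it to rounding error.
rng = np.random.default_rng(0)
shape, ranks = (20, 30, 40), (3, 4, 5)
true_core = rng.standard_normal(ranks)
true_factors = [rng.standard_normal((n, r)) for n, r in zip(shape, ranks)]
data = reconstruct(true_core, true_factors)

core, factors = st_hosvd(data, ranks)
approx = reconstruct(core, factors)
rel_error = np.linalg.norm(approx - data) / np.linalg.norm(data)
ratio = compression_ratio(shape, ranks)
```

Storage drops from the full entry count to the core plus one factor matrix per mode. A compression ratio of c corresponds to a size reduction of 1 - 1/c, so a ratio of 100 is a 99% reduction. Real simulation data is only approximately low rank, so in practice the ranks trade accuracy against compression.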


Bibliographic details

Source: ACM Transactions on Mathematical Software, 2020, Issue 2, pp. 13.1-13.31 (31 pages)

Authors: Ballard Grey; Klinvex Alicia; Kolda Tamara G.

Author affiliations:

Wake Forest Univ Dept Comp Sci Winston Salem NC 27109 USA;

Sandia Natl Labs Livermore CA 94551 USA;

Sandia Natl Labs Livermore CA 94551 USA;

Original format: PDF

Language: English

Keywords: Tucker decomposition; tensor decomposition; higher-order singular value decomposition (HOSVD)

