Fast Computation of Sparse Datacubes


Abstract

Datacube queries compute aggregates over database relations at a variety of granularities, and they constitute an important class of decision support queries. Real-world data is frequently sparse, and hence efficiently computing datacubes over large sparse relations is important. We show that current techniques for computing datacubes over sparse relations do not scale well with the number of CUBE BY attributes, especially when the relation is much larger than main memory.

We propose a novel algorithm for the fast computation of datacubes over sparse relations, and demonstrate the efficiency of our algorithm using synthetic, benchmark and real-world data sets. When the relation fits in memory, our technique performs multiple in-memory sorts, and does not incur any I/O beyond the input of the relation and the output of the datacube itself. When the relation does not fit in memory, a divide-and-conquer strategy divides the problem of computing the datacube into several simpler computations of sub-datacubes. Often, all but one of the sub-datacubes can be computed in memory and our in-memory solution applies. In that case, the total I/O overhead is linear in the number of CUBE BY attributes. We demonstrate with an implementation that the CPU cost of our algorithm is dominated by the I/O cost for sparse relations.
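To make the terminology in the abstract concrete, the following is a minimal sketch of what a datacube over a small in-memory relation contains: one SUM aggregate for every subset of the CUBE BY attributes, each obtained with one in-memory sort. It is a naive illustration only and is not the algorithm proposed in the paper. The function name `cube_by`, the attribute names, and the sample rows are all assumptions made up for this example.

```python
from itertools import combinations, groupby


def cube_by(rows, dims, measure):
    """Naive datacube: SUM(measure) for every subset of the CUBE BY attributes.

    Hypothetical helper for illustration; one in-memory sort per group-by.
    Returns {attribute subset: {group key tuple: aggregated sum}}.
    """
    cube = {}
    for k in range(len(dims) + 1):
        for attrs in combinations(dims, k):
            # Bind attrs as a default argument so each key function keeps its own subset.
            def key(row, attrs=attrs):
                return tuple(row[a] for a in attrs)

            # Sorting makes rows with equal group keys adjacent,
            # so a single pass with groupby can aggregate each run.
            ordered = sorted(rows, key=key)
            cube[attrs] = {
                group: sum(row[measure] for row in run)
                for group, run in groupby(ordered, key=key)
            }
    return cube


# Made-up sparse relation: 3 tuples, far fewer than the possible
# store x item x year combinations.
rows = [
    {"store": "A", "item": "pen", "year": 1996, "sales": 3},
    {"store": "A", "item": "ink", "year": 1997, "sales": 5},
    {"store": "B", "item": "pen", "year": 1997, "sales": 2},
]

cube = cube_by(rows, ("store", "item", "year"), "sales")
print(len(cube))           # number of group-bys: 2**3
print(cube[("store",)])    # totals per store
print(cube[()])            # grand total
```

With three CUBE BY attributes the sketch produces 2^3 = 8 group-bys, which shows why the number of attributes matters for scalability: the count of group-bys doubles with each added attribute. The sketch sorts the whole relation once per group-by and shares no work between them, so it says nothing about how the paper orders or reuses its sorts.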


Bibliographic record

Source: Proceedings of the Twenty-third international conference on very large data bases, 1997, pp. 116-125 (10 pages)
Conference location: Athens (GR)
Authors: Kenneth A. Ross; Divesh Srivastava
Affiliations: Columbia University; AT&T Labs-Research
Original format: PDF
Language: English
CLC classification: Various special-purpose databases

