Twenty-ninth International Conference on Very Large Databases; Sep 9-12, 2003; Berlin, Germany

Star-Cubing: Computing Iceberg Cubes by Top-Down and Bottom-Up Integration

Abstract

Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches: top-down and bottom-up. The former, represented by the Multi-Way Array Cube (called MultiWay) algorithm [25], aggregates simultaneously on multiple dimensions; however, it cannot take advantage of Apriori pruning [2] when computing iceberg cubes (cubes that contain only aggregate cells whose measure value satisfies a threshold, called the iceberg condition). The latter, represented by two algorithms, BUC [6] and H-Cubing [11], computes the iceberg cube bottom-up and facilitates Apriori pruning. BUC explores fast sorting and partitioning techniques, whereas H-Cubing explores a data structure, the H-Tree, for shared computation. However, neither of them fully explores multi-dimensional simultaneous aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the three previous algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-by's that do not satisfy the iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms all the previous methods in almost all kinds of data distributions.
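
To make the terms "iceberg condition" and "Apriori pruning" concrete, here is a minimal Python sketch of a bottom-up iceberg cube computation with a count measure. It is not Star-Cubing and it does not build a star-tree; it is only a simplified, BUC-style illustration of how pruning works. The function name `iceberg_cube`, the parameter `min_count`, and the use of `'*'` for an aggregated dimension are assumptions made for this example.

```python
from collections import defaultdict


def iceberg_cube(rows, num_dims, min_count):
    """Return {cell: count} for every cell with count >= min_count.

    A cell is a tuple holding one value per dimension, where '*'
    marks a dimension that has been aggregated away.
    Hypothetical helper for illustration; not the paper's algorithm.
    """
    result = {}

    def expand(partition, cell, start_dim):
        # `partition` holds exactly the rows matching the fixed values in `cell`.
        result[tuple(cell)] = len(partition)
        for d in range(start_dim, num_dims):
            # Partition the current rows on dimension d.
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                # Apriori pruning: if this cell fails the threshold, every
                # more specific cell below it fails too, so skip the subtree.
                if len(group) >= min_count:
                    cell[d] = value
                    expand(group, cell, d + 1)
                    cell[d] = '*'

    if len(rows) >= min_count:
        expand(rows, ['*'] * num_dims, 0)
    return result


rows = [("a1", "b1"), ("a1", "b2"), ("a1", "b1"), ("a2", "b1")]
cube = iceberg_cube(rows, num_dims=2, min_count=2)
for cell, count in sorted(cube.items()):
    print(cell, count)
```

With a threshold of 2, the cell `('a2', '*')` has count 1 and is pruned, so `('a2', 'b1')` is never examined. This is the pruning that the bottom-up methods gain; a top-down simultaneous aggregation computes such cells before it can know they fail the threshold.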