Conference: Database and Expert Systems Applications
A Probabilistic Approach for Computing Approximate Iceberg Cubes

Abstract

An iceberg cube is a refinement of a data cube that contains only the cells whose measure is larger than a given threshold (the iceberg condition). Iceberg cubes are well-established tools for fast data analysis, because they filter the information in a classical data cube down to its most relevant parts. Although the problem of computing iceberg cubes efficiently has been widely investigated, the task is intrinsically expensive because of the large amount of data that usually must be processed. In several application scenarios, efficiency is so crucial that users would benefit from the fast computation of even an incomplete iceberg cube. An incomplete iceberg cube can support preliminary data analysis by letting users focus their exploration quickly and effectively, thus saving large amounts of computational resources. In this paper, we propose a technique for efficiently computing iceberg cubes that can trade completeness of the result for computational efficiency. Specifically, we devise an algorithm that employs a probabilistic framework to avoid computing cells that are probably irrelevant (i.e., unlikely to satisfy the iceberg condition). The output of our algorithm is an incomplete iceberg cube that is computed efficiently and can be refined: the user can decide to compute the cells that were estimated to be irrelevant during previous invocations of the algorithm.
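
To make the iceberg condition concrete, the following Python sketch computes an exact iceberg cube with COUNT as the measure, followed by an incomplete variant that skips cells judged unlikely to qualify. The abstract does not describe the paper's probabilistic framework, so the variant swaps in a simple sample-based estimate as a stand-in; it is not the authors' algorithm. All names here (`cell_counts`, `iceberg_cube`, `approximate_iceberg_cube`, `sample_rate`, `slack`) are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations


def cell_counts(rows, dims, only=None):
    """Count rows per cell over every group-by of dims.

    A cell is keyed as (dimension names, values on those dimensions).
    If `only` is given, counters are kept just for the cells it contains.
    """
    counts = Counter()
    for k in range(len(dims) + 1):
        for group in combinations(range(len(dims)), k):
            names = tuple(dims[i] for i in group)
            for row in rows:
                cell = (names, tuple(row[i] for i in group))
                if only is None or cell in only:
                    counts[cell] += 1
    return counts


def iceberg_cube(rows, dims, min_count):
    """Exact iceberg cube: cells whose COUNT is larger than min_count."""
    counts = cell_counts(rows, dims)
    return {cell: n for cell, n in counts.items() if n > min_count}


def approximate_iceberg_cube(rows, dims, min_count, sample_rate,
                             slack=1.0, seed=0):
    """Incomplete iceberg cube (assumed stand-in, not the paper's method).

    Cells are first estimated on a random sample; exact counts are then
    computed only for cells whose scaled-up estimate looks promising.
    """
    rng = random.Random(seed)
    sample = [row for row in rows if rng.random() < sample_rate]
    # Scale each sample count by 1 / sample_rate to estimate the full count.
    promising = {
        cell
        for cell, n in cell_counts(sample, dims).items()
        if n / sample_rate > slack * min_count
    }
    exact = cell_counts(rows, dims, only=promising)
    return {cell: n for cell, n in exact.items() if n > min_count}


if __name__ == "__main__":
    rows = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "x")]
    dims = ("d1", "d2")
    print(iceberg_cube(rows, dims, min_count=1))
    print(approximate_iceberg_cube(rows, dims, min_count=1, sample_rate=0.5))
```

Every cell the variant returns carries its exact count, so the result is always a subset of the exact iceberg cube; cells it skipped can be recovered later by rerunning with a higher `sample_rate` or a lower `slack`, which mirrors the refinement step mentioned in the abstract. For brevity the sketch still enumerates every cell of every row in the exact pass and saves only counter memory, so it illustrates the idea rather than the speed-up.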