Information Sciences: An International Journal

An aggregation algorithm using a multidimensional file in multidimensional OLAP



Abstract

Aggregation is an operation that plays a key role in multidimensional OLAP (MOLAP). Existing aggregation methods in MOLAP have been proposed for file structures such as multidimensional arrays. These file structures are suitable for data with uniform distributions, but do not work well with skewed distributions. In this paper, we consider an aggregation method that uses dynamic multidimensional files that adapt to skewed distributions. In these multidimensional files, the sizes of page regions vary according to the data density in those regions, and the pages that belong to a larger region are accessed multiple times while computing aggregations. To solve this problem, we first present an aggregation computation model that uses the new notions of disjoint-inclusive partition and induced space-filling curves. Based on this model, we then present a dynamic aggregation algorithm. Using these notions, the algorithm allows us to maximize the effectiveness of the buffer: we control the page access order in such a way that a page being accessed can reside in the buffer until its next access. We have conducted experiments to show the effectiveness of our approach. Experimental results for a real data set show that the algorithm reduces the number of disk accesses by up to 5.09 times compared with a naive algorithm. The results further show that the algorithm achieves near-optimal performance (i.e., normalized I/O = 1.01) with the total main memory (needed for the buffer and the result table) less than 1.0% of the database size. We believe our work also provides an excellent formal basis for investigating further issues in computing aggregations in MOLAP. © 2003 Published by Elsevier Science Inc. [References: 19]
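The effect of page access order on buffer effectiveness can be illustrated with a toy simulation. The sketch below is not the paper's algorithm: it does not implement disjoint-inclusive partitions or induced space-filling curves. It assumes a hypothetical 4x4 grid in which each cell names the page whose region covers it, an LRU buffer, and two illustrative traversal orders (`row_major` and `block_major`, both names invented here). It only shows that pages with larger regions are re-read when the traversal leaves and re-enters their region.

```python
from collections import OrderedDict


def count_disk_reads(page_sequence, buffer_size):
    """Count misses of an LRU buffer that holds `buffer_size` pages."""
    buffer = OrderedDict()
    reads = 0
    for page in page_sequence:
        if page in buffer:
            # Buffer hit: mark the page as most recently used.
            buffer.move_to_end(page)
        else:
            # Buffer miss: one disk read, evict the least recently used page.
            reads += 1
            buffer[page] = True
            if len(buffer) > buffer_size:
                buffer.popitem(last=False)
    return reads


# Hypothetical layout (an assumption for illustration only).
# Pages "A" and "H" cover large, sparse 2x2 regions; the rest cover one cell.
GRID = [
    ["A", "A", "B", "C"],
    ["A", "A", "D", "E"],
    ["F", "G", "H", "H"],
    ["I", "J", "H", "H"],
]


def row_major(grid):
    """Visit cells row by row, ignoring region boundaries."""
    return [page for row in grid for page in row]


def block_major(grid, block):
    """Visit cells one block x block square at a time."""
    n = len(grid)
    sequence = []
    for r0 in range(0, n, block):
        for c0 in range(0, n, block):
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    sequence.append(grid[r][c])
    return sequence


naive_reads = count_disk_reads(row_major(GRID), buffer_size=2)
ordered_reads = count_disk_reads(block_major(GRID, 2), buffer_size=2)
distinct_pages = len({page for row in GRID for page in row})
print(naive_reads, ordered_reads, distinct_pages)
```

With a two-page buffer, the row-major order costs 12 disk reads in this toy layout, while the block order costs 10, which equals the number of distinct pages and is therefore the minimum possible. The ratio of actual reads to distinct pages is a rough analogue of a normalized I/O figure, though how the paper defines that metric is not stated in the abstract.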
