12th Asia Pacific Web Conference (APWeb 2010)

ParaCube: A Scalable OLAP Model Based on Distributed Aggregate Computing with Sibling Cubes

Abstract

Demands on OLAP applications are growing rapidly as data volume, user counts, query volume, and query complexity increase dramatically. The need for shorter update periods in the data warehouse is another crucial factor for a scalable OLAP application. In this paper, we propose a scalable OLAP prototype that supports query processing over growing data volumes by distributing the fact tuples across multiple servers to construct a set of sibling cubes, which can be merged to obtain the whole cube. Observing that dimension tables account for a very low proportion of the space cost, we employ a lightweight distribution policy in which the dimension tables are fully duplicated on each sibling server. An OLAP query with distributive aggregate functions can be transformed into queries that run in parallel on the sibling servers. For non-distributive aggregate functions, such as median, we propose an optimized median aggregate computing algorithm that reduces the transmission volume between servers while computing the global median values. We also present a three-level data warehouse framework to meet the requirement of shorter update periods in "operational business intelligence". An asynchronous tunnel model is proposed to reduce update latency by pre-fetching updated tuples to the OLAP processing server. Finally, we build the prototype system ParaCube to evaluate performance on shared-nothing (SN) systems and multi-core platforms.
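
The sibling-cube idea for mergeable aggregates can be illustrated with a small sketch. This is not the ParaCube implementation: the function names, the round-robin partitioning, and the toy store-to-region dimension table are all assumptions made for illustration. Each hypothetical server aggregates its own slice of the fact tuples against a full local copy of the dimension table, and the partial (sum, count) cells are then merged into the whole cube.

```python
from collections import defaultdict

def partition(facts, n_servers):
    """Split fact tuples round-robin across n_servers (assumed policy)."""
    return [facts[i::n_servers] for i in range(n_servers)]

def build_sibling_cube(fact_rows, dim_table):
    """Aggregate one server's fact slice into partial [sum, count] cells.

    dim_table is the fully duplicated dimension table, so the lookup
    below needs no communication with other servers.
    """
    cube = defaultdict(lambda: [0, 0])
    for store_id, amount in fact_rows:
        region = dim_table[store_id]  # local dimension lookup
        cube[region][0] += amount     # partial SUM
        cube[region][1] += 1          # partial COUNT
    return cube

def merge_sibling_cubes(cubes):
    """Merge sibling cubes cell by cell to obtain the whole cube."""
    merged = defaultdict(lambda: [0, 0])
    for cube in cubes:
        for key, (partial_sum, partial_count) in cube.items():
            merged[key][0] += partial_sum
            merged[key][1] += partial_count
    return merged
```

Merging works here because SUM and COUNT can be combined from per-server partial results (and AVG follows as sum divided by count). A median cannot be recovered from such partial cells, which is why the abstract treats it separately; the sketch does not attempt to reproduce the paper's median algorithm.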