Journal of Parallel and Distributed Computing

Scalable real-time OLAP on cloud architectures

Abstract

In contrast to queries for on-line transaction processing (OLTP) systems, which typically access only a small portion of a database, OLAP queries may need to aggregate large portions of a database, which often leads to performance issues. In this paper we introduce CR-OLAP, a scalable cloud-based real-time OLAP system built on a new distributed index structure for OLAP, the distributed PDCR tree. CR-OLAP utilizes a scalable cloud infrastructure consisting of multiple commodity servers (processors): as the database grows, CR-OLAP dynamically increases the number of processors to maintain performance. Our distributed PDCR tree data structure supports multiple dimension hierarchies and efficient query processing on the elaborate dimension hierarchies that are so central to OLAP systems. It is particularly efficient for complex OLAP queries that need to aggregate large portions of the data warehouse, such as "report the total sales in all stores located in California and New York during the months February-May of all years". We evaluated CR-OLAP on the Amazon EC2 cloud, using the TPC-DS benchmark data set. The tests demonstrate that CR-OLAP scales well with an increasing number of processors, even for complex queries. For example, for an Amazon EC2 cloud instance with 16 processors, a data warehouse with 160 million tuples, and a TPC-DS OLAP query stream in which each query aggregates between 60% and 95% of the database, CR-OLAP achieved a query latency below 0.3 s, which can be considered a real-time response.
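
To make the example query in the abstract concrete, the following is a minimal Python sketch of what such an aggregation computes. It is a naive linear scan over an in-memory list and is not the distributed PDCR tree or any part of CR-OLAP; the `Sale` record, its field names, the `total_sales` helper, and the sample rows are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """One fact tuple (hypothetical schema, not taken from the paper)."""
    state: str    # coarse level of an assumed store hierarchy (state > store)
    store: str    # fine level of the assumed store hierarchy
    year: int     # coarse level of an assumed date hierarchy (year > month)
    month: int    # fine level of the assumed date hierarchy, 1..12
    amount: float


def total_sales(facts, states, months):
    """Sum sales for stores in `states` during `months`, across all years."""
    return sum(
        f.amount for f in facts
        if f.state in states and f.month in months
    )


# Made-up sample data, for illustration only.
FACTS = [
    Sale("California", "CA-01", 2011, 2, 100.0),
    Sale("California", "CA-02", 2012, 5, 50.0),
    Sale("New York", "NY-01", 2011, 3, 25.0),
    Sale("New York", "NY-01", 2012, 7, 999.0),  # July: outside February-May
    Sale("Texas", "TX-01", 2012, 4, 500.0),     # Texas: outside the states
]

# "Total sales in all stores located in California and New York
#  during the months February-May of all years."
result = total_sales(FACTS, {"California", "New York"}, range(2, 6))
print(result)
```

The query constrains one level in each dimension hierarchy (state, month) and leaves the others (store, year) unconstrained, so a scan like this touches every tuple. That cost is the motivation the abstract gives for an index structure that aggregates along dimension hierarchies.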