Progressive Optimization in a Shared-Nothing Parallel Database

Abstract

Commercial enterprise data warehouses are typically implemented on parallel databases because of the inherent scalability and performance limitations of a serial architecture. Queries in such large data warehouses can contain complex predicates as well as multiple joins, and the query execution plans generated by the optimizer may be suboptimal because row cardinalities are mis-estimated. Progressive optimization (POP) is an approach that detects cardinality estimation errors by monitoring actual cardinalities at runtime and recovers by triggering re-optimization with the measured cardinalities. However, the original POP solution is based on a serial processing architecture, and its core ideas cannot be readily applied to a parallel shared-nothing environment. Extending serial POP to a parallel environment is challenging because we need to determine when and how to trigger re-optimization based on cardinalities collected from multiple independent nodes. In this paper, we present a comprehensive and practical solution to this problem, including several novel voting schemes for deciding whether to trigger re-optimization, a mechanism to reuse local intermediate results across nodes as a partitioned materialized view, several flavors of parallel checkpoint operators, and parallel checkpoint processing methods using efficient communication protocols. This solution has been prototyped in a leading commercial parallel DBMS. We have performed extensive experiments using the TPC-H benchmark and a real-world database. Experimental results show that our solution has negligible runtime overhead and speeds up complex OLAP queries by up to a factor of 22.
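
The abstract does not spell out the voting schemes, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes each node compares the row count it actually observed at a checkpoint against a hypothetical acceptable range around the optimizer's estimate, and that a coordinator then applies either an "any node" rule or a simple majority rule. The names `Checkpoint`, `any_vote`, and `majority_vote`, and both rules, are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Checkpoint:
    """Hypothetical checkpoint: optimizer estimate plus an assumed acceptable range."""
    estimated_rows: int
    lower_bound: int  # assumed: below this, the current plan is considered suspect
    upper_bound: int  # assumed: above this, the current plan is considered suspect

    def violated(self, actual_rows: int) -> bool:
        # A node "votes" for re-optimization when its observed count leaves the range.
        return not (self.lower_bound <= actual_rows <= self.upper_bound)


def any_vote(cp: Checkpoint, actual_rows_per_node: List[int]) -> bool:
    """Illustrative rule: re-optimize if at least one node reports a violation."""
    return any(cp.violated(n) for n in actual_rows_per_node)


def majority_vote(cp: Checkpoint, actual_rows_per_node: List[int]) -> bool:
    """Illustrative rule: re-optimize only if more than half of the nodes report a violation."""
    votes = sum(cp.violated(n) for n in actual_rows_per_node)
    return votes * 2 > len(actual_rows_per_node)


if __name__ == "__main__":
    cp = Checkpoint(estimated_rows=1000, lower_bound=500, upper_bound=2000)
    observed = [900, 5000, 7000, 1200]  # made-up per-node row counts
    print("any node:", any_vote(cp, observed))
    print("majority:", majority_vote(cp, observed))
```

With the made-up counts above, two of the four nodes fall outside the range, so the "any node" rule would trigger re-optimization while the strict-majority rule would not. This shows why the choice of voting scheme matters when nodes see skewed data.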

Bibliographic record

Source
SIGMOD/PODS 2007 | 2007 | pp. 809-820 | 12 pages
Authors
Wook-Shin Han; Jack Ng; Volker Markl; Holger Kache; Mokhtar Kandil
Original format: PDF
Chinese Library Classification: TP311.13
Keywords
Query optimization; OLAP; Parallel databases; Autonomous computing
