Journal: IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

Benchmarking attribute cardinality maps for database systems using the TPC-D specifications



Abstract

Benchmarking is an important phase in developing any new software technique because it helps validate the underlying theory in the specific problem domain. But benchmarking new software strategies is a very complex problem, because it is difficult (if not impossible) to test, validate, and verify the results of the various schemes in completely different settings. This is even more true for database systems, because the benchmarking also depends on the types of queries presented to the databases used in the benchmarking experiments. Query optimization strategies in relational database systems rely on approximate estimates of query result sizes to minimize the response time for user queries. Among the many query result size estimation techniques, histogram-based techniques are by far the most commonly used in modern database systems. These techniques estimate query result sizes by approximating the underlying data distributions and are thus prone to estimation errors. In two recent works, we proposed (and thoroughly analyzed) two new histogram-like techniques, the rectangular and trapezoidal attribute cardinality maps (ACMs), that give much smaller estimation errors than the traditional equi-width and equi-depth histograms currently used by many commercial database systems. This paper reports how the Rectangular-ACM (R-ACM) and the Trapezoidal-ACM (T-ACM) can be benchmarked for query optimization. By conducting an extensive set of experiments using the acclaimed TPC-D benchmark queries and database, we demonstrate that these new ACM schemes are much more accurate than the traditional histograms for query result size estimation.
Apart from demonstrating the power of the ACMs, this paper also shows how TPC-D benchmarking can be achieved using a large synthetic database with many different patterns of synthetic queries that are representative of a real-world business environment.
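
For readers unfamiliar with the baselines named in the abstract, the following is a minimal sketch of how equi-width and equi-depth histograms can estimate the result size of a range query. It is not the R-ACM or T-ACM, and it is not taken from the paper. The function names, the sample data, and the uniform-within-bucket assumption are all illustrative choices made here.

```python
from bisect import bisect_left


def equi_width_bounds(lo, hi, num_buckets):
    """Split [lo, hi] into num_buckets ranges of equal width."""
    return [lo + (hi - lo) * i / num_buckets for i in range(num_buckets + 1)]


def equi_depth_bounds(ordered, num_buckets):
    """Pick boundaries so each bucket holds roughly the same number of values."""
    n = len(ordered)
    inner = [ordered[(i * n) // num_buckets] for i in range(1, num_buckets)]
    return [ordered[0]] + inner + [ordered[-1]]


def bucket_counts(ordered, bounds):
    """Count sorted values per bucket; buckets are [left, right), last one closed."""
    k = len(bounds) - 1
    counts = []
    for i in range(k):
        start = bisect_left(ordered, bounds[i])
        end = len(ordered) if i == k - 1 else bisect_left(ordered, bounds[i + 1])
        counts.append(end - start)
    return counts


def estimate_range_count(bounds, counts, a, b):
    """Estimate how many values fall in [a, b].

    Assumption (illustrative): values are spread uniformly inside each bucket,
    so a bucket contributes in proportion to how much of it the query covers.
    """
    total = 0.0
    for left, right, count in zip(bounds, bounds[1:], counts):
        if right > left:
            overlap = min(b, right) - max(a, left)
            if overlap > 0:
                total += count * overlap / (right - left)
        elif a <= left <= b:
            # Zero-width bucket (all its values are equal): count it whole.
            total += count
    return total


if __name__ == "__main__":
    # Hypothetical skewed attribute values, already sorted.
    data = [0, 1, 1, 1, 2, 2, 3, 5, 8, 20]
    true_count = sum(1 for v in data if 0 <= v <= 2.5)

    ew_bounds = equi_width_bounds(0, 20, 4)
    ed_bounds = equi_depth_bounds(data, 4)
    ew_est = estimate_range_count(ew_bounds, bucket_counts(data, ew_bounds), 0, 2.5)
    ed_est = estimate_range_count(ed_bounds, bucket_counts(data, ed_bounds), 0, 2.5)
    print("true:", true_count, "equi-width:", ew_est, "equi-depth:", ed_est)
```

Both estimators differ from the true count on skewed data because each bucket is summarized by a single frequency. This is the kind of estimation error that the abstract says the ACM schemes reduce.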
