ACM SIGMOD International Conference on Management of Data
Histograms Reloaded: The Merits of Bucket Diversity

Abstract

Virtually all histograms store, for each bucket, the number of distinct values it contains and their average frequency. In this paper, we question this paradigm. We start by investigating the estimation precision of three commercial database systems that also follow this paradigm. It turns out that huge errors are quite common. We then introduce new bucket types and investigate their accuracy when building optimal histograms with them. The results are ambiguous: there is no clear winner among the bucket types. At this point, we (1) switch to heterogeneous histograms, in which different buckets of the same histogram may be of different types, and (2) design more bucket types. The nice consequence of introducing heterogeneous histograms is that we can guarantee decent upper error bounds while requiring far less space than homogeneous histograms.
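
As a rough illustration of the conventional bucket the abstract refers to (a distinct-value count plus an average frequency), the following Python sketch summarizes one bucket and uses it to estimate a range query. The names `Bucket`, `build_bucket`, and `estimate_range`, the assumption that distinct values are spread evenly over the bucket, and the sample data are all illustrative assumptions; none of them is taken from the paper or from any particular database system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    """Conventional bucket over the half-open value range [lo, hi)."""
    lo: float
    hi: float
    distinct: int     # number of distinct values in the bucket
    avg_freq: float   # average frequency of those distinct values


def build_bucket(values, lo, hi):
    """Summarize the values that fall in [lo, hi) as one bucket."""
    inside = [v for v in values if lo <= v < hi]
    distinct = len(set(inside))
    avg_freq = len(inside) / distinct if distinct else 0.0
    return Bucket(lo, hi, distinct, avg_freq)


def estimate_range(bucket, a, b):
    """Estimate the number of rows with a value in [a, b).

    Assumption (ours): the distinct values are spread evenly over
    the bucket, so the estimate scales with the overlapping width.
    """
    width = bucket.hi - bucket.lo
    if width <= 0:
        return 0.0
    overlap = max(0.0, min(b, bucket.hi) - max(a, bucket.lo))
    return bucket.distinct * bucket.avg_freq * overlap / width


# Hypothetical skewed data: one very frequent value and three rare ones.
values = [1] * 97 + [50, 60, 70]
bucket = build_bucket(values, 0, 100)

true_count = sum(1 for v in values if 40 <= v < 80)
estimate = estimate_range(bucket, 40, 80)
print(f"true={true_count} estimate={estimate}")
```

On this made-up data the single average frequency hides the skew, so the estimate for the range [40, 80) is far above the true count. This is only meant to show why one bucket type may not suit every data distribution; it does not reproduce the paper's bucket types or its error measure.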