Proceedings of the Twenty-third International Conference on Very Large Data Bases

Selectivity Estimation Without the Attribute Value Independence Assumption


Abstract

The result size of a query that involves multiple attributes from the same relation depends on these attributes' joint data distribution, i.e., the frequencies of all combinations of attribute values. To simplify the estimation of that size, most commercial systems make the attribute value independence assumption and maintain statistics (typically histograms) on individual attributes only. In reality, this assumption is almost always wrong and the resulting estimates tend to be highly inaccurate. In this paper, we propose two main alternatives for effectively approximating (multi-dimensional) joint data distributions: (a) using a multi-dimensional histogram, and (b) using the Singular Value Decomposition (SVD) technique from linear algebra. An extensive set of experiments demonstrates the advantages and disadvantages of the two approaches and the benefits of both compared to the independence assumption.
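
To make the independence problem concrete, the following minimal Python sketch compares the true selectivity of a conjunctive equality predicate, read off the joint distribution, with the estimate obtained by multiplying per-attribute selectivities. It is an illustration only and not the paper's method: the function names and the toy relation `rows` are assumptions introduced here, and neither the multi-dimensional histogram nor the SVD technique is implemented.

```python
from collections import Counter


def true_selectivity(rows, a_val, b_val):
    """Fraction of rows with A == a_val and B == b_val, using joint frequencies."""
    joint = Counter(rows)  # frequency of every (A, B) combination
    return joint[(a_val, b_val)] / len(rows)


def independence_estimate(rows, a_val, b_val):
    """Estimate the same fraction from single-attribute frequencies only,
    i.e. under the attribute value independence assumption."""
    n = len(rows)
    freq_a = Counter(a for a, _ in rows)
    freq_b = Counter(b for _, b in rows)
    return (freq_a[a_val] / n) * (freq_b[b_val] / n)


# Hypothetical relation: 100 rows in which B always equals A (fully correlated).
rows = [(i % 4, i % 4) for i in range(100)]

print(true_selectivity(rows, 1, 1))       # A = 1 AND B = 1, from the joint distribution
print(independence_estimate(rows, 1, 1))  # same predicate, independence assumed
print(true_selectivity(rows, 1, 2))       # a combination that never occurs
print(independence_estimate(rows, 1, 2))  # independence still predicts matches
```

On this toy data the independence estimate is too low for the combination that occurs and too high for the one that never does, which is the kind of error a summary of the joint distribution is meant to reduce.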
