Journal: Knowledge and Data Engineering, IEEE Transactions on

Improving Accuracy and Robustness of Self-Tuning Histograms by Subspace Clustering



Abstract

In large databases, the amount and complexity of the data call for data summarization techniques. Such summaries are used to support fast approximate query answering or query optimization. Histograms are a prominent class of model-free data summaries and are widely used in database systems. So-called self-tuning histograms look at query-execution results to refine themselves. An assumption about such histograms that has not been questioned so far is that they can learn the dataset from scratch, that is, starting with an empty bucket configuration. We show that this is not the case. Self-tuning methods are very sensitive to the initial configuration. Three major problems stem from this: traditional self-tuning is unable to learn projections of multi-dimensional data, is sensitive to the order of queries, and reaches only local optima with high estimation errors. We show how to improve a self-tuning method significantly by starting with a carefully chosen initial configuration. We propose initialization by dense subspace clusters in projections of the data, which improves both the accuracy and the robustness of self-tuning. Our experiments on different datasets show that the error rate is typically halved compared to the uninitialized version.
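
As a rough illustration of the feedback loop the abstract refers to, the sketch below refines a histogram from query-execution results. It is not the method of the paper: it is a simplified one-dimensional histogram with fixed bucket boundaries, and the names `Bucket`, `estimate`, `refine`, and the `damping` parameter are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bucket:
    lo: float    # inclusive lower bound of the bucket
    hi: float    # exclusive upper bound of the bucket
    freq: float  # current estimate of the number of tuples in [lo, hi)


def overlap_fraction(b: Bucket, q_lo: float, q_hi: float) -> float:
    """Fraction of bucket b that the range query [q_lo, q_hi) covers."""
    width = b.hi - b.lo
    covered = max(0.0, min(b.hi, q_hi) - max(b.lo, q_lo))
    return covered / width if width > 0 else 0.0


def estimate(buckets: List[Bucket], q_lo: float, q_hi: float) -> float:
    """Estimate the result size, assuming uniformity inside each bucket."""
    return sum(b.freq * overlap_fraction(b, q_lo, q_hi) for b in buckets)


def refine(buckets: List[Bucket], q_lo: float, q_hi: float,
           actual: float, damping: float = 0.5) -> None:
    """Nudge bucket frequencies toward the observed result size.

    The (damped) estimation error is shared among the overlapping buckets
    in proportion to their current contribution to the estimate. If all
    contributions are zero, it is shared in proportion to the overlap.
    """
    fracs = [overlap_fraction(b, q_lo, q_hi) for b in buckets]
    error = damping * (actual - estimate(buckets, q_lo, q_hi))
    weights = [b.freq * f for b, f in zip(buckets, fracs)]
    total = sum(weights)
    if total == 0:
        weights, total = fracs, sum(fracs)
    if total == 0:
        return  # the query does not overlap any bucket
    for b, w in zip(buckets, weights):
        b.freq = max(0.0, b.freq + error * w / total)


if __name__ == "__main__":
    # Start "from scratch": two buckets with no frequency information.
    hist = [Bucket(0, 10, 0.0), Bucket(10, 20, 0.0)]
    refine(hist, 0, 10, actual=100, damping=1.0)
    print(estimate(hist, 0, 5))  # prints 50.0
```

In this sketch only the frequencies are updated, while the bucket boundaries never change, so whatever the initial layout misses cannot be repaired by feedback alone. This is a much cruder setting than the one studied in the paper, but it gives some intuition for why the initial configuration can matter.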
