首页> 外文会议>ACM SIGMOD international conference on Management of data >Selectivity estimation using probabilistic models
【24h】

Selectivity estimation using probabilistic models

机译:使用概率模型的选择性估计

获取原文

摘要

Estimating the result size of complex queries that involve selection on multiple attributes and the join of several relations is a difficult but fundamental task in database query processing. It arises in cost-based query optimization, query profiling, and approximate query answering. In this paper, we show how probabilistic graphical models can be effectively used for this task as an accurate and compact approximation of the joint frequency distribution of multiple attributes across multiple relations. Probabilistic Relational Models (PRMs) are a recent development that extends graphical statistical models such as Bayesian Networks to relational domains. They represent the statistical dependencies between attributes within a table, and between attributes across foreign-key joins. We provide an efficient algorithm for constructing a PRM front a database, and show how a PRM can be used to compute selectivity estimates for a broad class of queries. One of the major contributions of this work is a unified framework for the estimation of queries involving both select and foreign-key join operations. Furthermore, our approach is not limited to answering a small set of predetermined queries; a single model can be used to effectively estimate the sizes of a wide collection of potential queries across multiple tables. We present results for our approach on several real-world databases. For both single-table multi-attribute queries and a general class of select-join queries, our approach produces more accurate estimates than standard approaches to selectivity estimation, using comparable space and time.

机译:

在数据库查询处理中,估计涉及多个属性的选择和多个关系的联接的复杂查询的结果大小是一项困难但基本的任务。它出现在基于成本的查询优化,查询概要分析和近似查询回答中。在本文中,我们展示了概率图形模型如何可以有效地用于此任务,作为跨多个关系的多个属性的联合频率分布的精确而紧凑的近似值。 概率关系模型(PRM)是最近的一项发展,它将图形统计模型(如贝叶斯网络)扩展到关系域。它们表示表中的属性之间以及外键联接的属性之间的统计依赖性。我们提供了一种用于在数据库前构建PRM的有效算法,并展示了如何将PRM用于计算广泛的查询类别的选择性估计。这项工作的主要贡献之一是用于估计涉及选择和外键联接操作的查询的统一框架。此外,我们的方法不仅限于回答一小组预定的查询;可以使用一个模型来有效地估计跨多个表的大量潜在查询的大小。我们在几个真实的数据库上展示了我们的方法的结果。对于单表多属性查询和一般类的选择联接查询,我们的方法使用可比较的时空,比标准的选择性估计方法产生的估计值更准确。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号