Estimating the result size of complex queries that involve selection on multiple attributes and the join of several relations is a difficult but fundamental task in database query processing. It arises in cost-based query optimization, query profiling, and approximate query answering. In this paper, we show how probabilistic graphical models can be effectively used for this task as an accurate and compact approximation of the joint frequency distribution of multiple attributes across multiple relations. Probabilistic Relational Models (PRMs) are a recent development that extends graphical statistical models such as Bayesian Networks to relational domains. They represent the statistical dependencies between attributes within a table, and between attributes across foreign-key joins. We provide an efficient algorithm for constructing a PRM front a database, and show how a PRM can be used to compute selectivity estimates for a broad class of queries. One of the major contributions of this work is a unified framework for the estimation of queries involving both select and foreign-key join operations. Furthermore, our approach is not limited to answering a small set of predetermined queries; a single model can be used to effectively estimate the sizes of a wide collection of potential queries across multiple tables. We present results for our approach on several real-world databases. For both single-table multi-attribute queries and a general class of select-join queries, our approach produces more accurate estimates than standard approaches to selectivity estimation, using comparable space and time.
在数据库查询处理中,估计涉及多个属性的选择和多个关系的联接的复杂查询的结果大小是一项困难但基本的任务。它出现在基于成本的查询优化,查询概要分析和近似查询回答中。在本文中,我们展示了概率图形模型如何可以有效地用于此任务,作为跨多个关系的多个属性的联合频率分布的精确而紧凑的近似值。 概率关系模型(PRM) I>是最近的一项发展,它将图形统计模型(如贝叶斯网络)扩展到关系域。它们表示表中的属性之间以及外键联接的属性之间的统计依赖性。我们提供了一种用于在数据库前构建PRM的有效算法,并展示了如何将PRM用于计算广泛的查询类别的选择性估计。这项工作的主要贡献之一是用于估计涉及选择和外键联接操作的查询的统一框架。此外,我们的方法不仅限于回答一小组预定的查询;可以使用一个模型来有效地估计跨多个表的大量潜在查询的大小。我们在几个真实的数据库上展示了我们的方法的结果。对于单表多属性查询和一般类的选择联接查询,我们的方法使用可比较的时空,比标准的选择性估计方法产生的估计值更准确。 P>
机译:自适应主动外观模型纹理建模的概率主成分分析及其在头部姿态估计中的应用
机译:自主驾驶概率学习模型中的数据驱动转换矩阵估计
机译:动态PET的概率图形模型:直接参数映射估计和图像重建的新方法
机译:使用概率模型选择性估计
机译:概率材料建模的选择性激光熔化A357铝合金对激光震动喷丸
机译:基于人口的人体暴露模型的住宅空气交换率的概率估计
机译:使用概率模型的选择性估计