Skyline Cardinality for Relational Processing: How Many Vectors Are Maximal?

Abstract

The skyline clause—also called the Pareto clause—recently has been proposed as an extension to SQL. It selects the tuples that are Pareto optimal with respect to a set of designated skyline attributes. This is the maximal vector problem in a relational context, but it represents a powerful extension to SQL which allows for the natural expression of on-line analytic processing (OLAP) queries and preferences in queries. Cardinality estimation of skyline sets is the focus in this work. A better understanding of skyline cardinality—and other properties of the skyline—is useful for better design of skyline algorithms, is necessary to extend a query optimizer's cost model to accommodate skyline queries, and helps to understand better how to use skyline effectively for OLAP and preference queries. Within a basic model with assumptions of sparseness of values on attributes' domains and statistical independence across attributes, we establish the expected skyline cardinality for skyline queries. While asymptotic bounds have been previously established, they are not widely known nor applied in skyline work. We show concrete estimates, as would be needed in a cost model, and consider the nature of the distribution of skyline. We next establish the effects on skyline cardinality as the constraints on our basic model are relaxed. Some of the results are quite counter-intuitive, and understanding these is critical to skyline's use in OLAP and preference queries. We consider when attributes' values repeat on their domains, and show the number of skyline is diminished. We consider the effects of having Zipfian distributions on the attributes' domains, and generalize the expectation for other distributions. Last, we consider the ramifications of correlation across the attributes.
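
As a rough illustration of the basic model described above (no repeated values, statistically independent attributes), the sketch below compares a brute-force skyline with an analytic expectation. The abstract does not state a formula; the code assumes the classical result for maximal vectors, in which the expected skyline size for n tuples over d attributes is the generalized harmonic number H(d-1, n), defined by H(0, n) = 1 and H(k, n) = sum over i = 1..n of H(k-1, i) / i. The function names `dominates`, `skyline`, and `expected_skyline_size` are hypothetical and chosen only for this example.

```python
import random


def dominates(a, b):
    """True if a is at least as good as b on every attribute and strictly
    better on at least one (larger values are assumed to be better)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(points):
    """Brute-force skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def expected_skyline_size(n, d):
    """Generalized harmonic number H(d-1, n), computed with the recurrence
    H(k, n) = sum_{i=1..n} H(k-1, i) / i and H(0, i) = 1.
    Assumption: attributes are independent and contain no duplicate values."""
    h = [1.0] * (n + 1)  # h[i] holds H(0, i); index 0 is unused
    for _ in range(d - 1):
        nxt = [0.0] * (n + 1)
        acc = 0.0
        for i in range(1, n + 1):
            acc += h[i] / i
            nxt[i] = acc
        h = nxt
    return h[n]


if __name__ == "__main__":
    # Small simulation: uniform random floats make duplicates practically absent.
    random.seed(0)
    n, d, trials = 200, 3, 50
    total = 0
    for _ in range(trials):
        pts = [tuple(random.random() for _ in range(d)) for _ in range(n)]
        total += len(skyline(pts))
    print("simulated mean:", total / trials)
    print("analytic value:", expected_skyline_size(n, d))
```

The simulated mean should land near the analytic value for independent, duplicate-free data. Repeated values, skewed distributions, or correlated attributes fall outside the assumptions of this sketch.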

Bibliographic record

Source: Foundations of Information and Knowledge Systems, 2004, pp. 78-97 (20 pages)
Author: Parke Godfrey
Original format: PDF
Chinese Library Classification: Information processing

Similar literature

1. Efficient maximal reverse skyline query processing [J]. Banaei-Kashani Farnoush, Ghaemi Parisa, Movaqar Bahman. Geoinformatica: An international journal of advances of computer science for geographic. 2017, no. 3
2. Cardinality-Aware Purely Relational XQuery Processor [J]. Sharif Sakr. Journal of database management. 2009, no. 3
3. Processing spatial skyline queries in both vector spaces and spatial network databases [J]. Fatma Mili. Computing reviews. 2010, no. 6
4. Skyline Cardinality for Relational Processing: How Many Vectors Are Maximal? [C]. Parke Godfrey. International Symposium. 2004
5. Imaging Hope: A Process-Relational Vision of God as Fellow Sufferer and Servant-Leader in the Life and Work of Cardinal Francis Xavier Nguyen Van Thuân. [D]. Tran, Dung Q. 2015
6. On the automaticity of relational stimulus processing: The (extrinsic) relational Simon task [O]. Adriaan Spruyt, Jan De Houwer
7. Group-by skyline query processing in relational engines [O]. Luk, MH, Yiu, ML, Lo, E. 2009
