Published in: Data Engineering, ICDE, 2009 IEEE 25th International Conference on

SPROUT: Lazy vs. Eager Query Plans for Tuple-Independent Probabilistic Databases

Abstract

A paramount challenge in probabilistic databases is the scalable computation of confidences of tuples in query results. This paper introduces an efficient secondary-storage operator for exact computation of queries on tuple-independent probabilistic databases. We consider the conjunctive queries without self-joins that are known to be tractable on any tuple-independent database, and queries that are not tractable in general but become tractable on probabilistic databases restricted by functional dependencies. Our operator is semantically equivalent to a sequence of aggregations and can be naturally integrated into existing relational query plans. As a proof of concept, we developed an extension of the PostgreSQL 8.3.3 query engine called SPROUT. We study optimizations that push or pull our operator or parts thereof past joins. The operator employs static information, such as the query structure and functional dependencies, to decide which constituent aggregations can be evaluated together in one scan and how many scans are needed for the overall confidence computation task. A case study on the TPC-H benchmark reveals that most TPC-H queries obtained by removing aggregations can be evaluated efficiently using our operator. Experimental evaluation on probabilistic TPC-H data shows substantial efficiency improvements when compared to the state of the art.
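
The remark that confidence computation can be expressed as a sequence of aggregations can be illustrated with a small sketch. The code below is not SPROUT's operator or its PostgreSQL implementation; it is the textbook evaluation of one tractable query, the Boolean query q :- R(x), S(x, y), over in-memory dictionaries. The function names, the dictionary layout, and the sample probabilities are all assumptions made for illustration.

```python
from collections import defaultdict
from math import prod


def independent_or(probs):
    """Probability that at least one of several independent events occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def confidence_r_join_s(r, s):
    """Confidence of the Boolean query  q :- R(x), S(x, y).

    r maps x -> probability of tuple R(x)            (hypothetical layout)
    s maps (x, y) -> probability of tuple S(x, y)    (hypothetical layout)
    All tuples are assumed to be independent.
    """
    # Aggregation 1: for each x, collapse the S-tuples that share that x.
    s_by_x = defaultdict(list)
    for (x, _y), p in s.items():
        s_by_x[x].append(p)
    s_conf = {x: independent_or(ps) for x, ps in s_by_x.items()}

    # Join on x: R(x) and "some S(x, y) exists" are independent, so multiply.
    joined = [r[x] * s_conf[x] for x in r if x in s_conf]

    # Aggregation 2: distinct x values are independent, so combine once more.
    return independent_or(joined)


# Made-up sample data, for illustration only.
r = {1: 0.5, 2: 0.8}
s = {(1, "a"): 0.5, (1, "b"): 0.5, (2, "a"): 0.25}
print(confidence_r_join_s(r, s))  # about 0.5, up to floating-point rounding
```

Each step here is a group-by or a join, which is why a computation of this shape can sit inside an ordinary relational query plan. How many of these aggregations can share a single scan is the kind of decision the abstract attributes to the operator's use of static information.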