Conference: Data Engineering, ICDE, 2009 IEEE 25th International Conference on

Perm: Processing Provenance and Data on the Same Data Model through Query Rewriting



Abstract

Data provenance is information that describes how a given data item was produced. The provenance includes source and intermediate data as well as the transformations involved in producing the concrete data item. In the context of relational databases, the source and intermediate data items are relations, tuples, and attribute values. The transformations are SQL queries and/or functions on the relational data items. Existing approaches capture provenance information by extending the underlying data model. This has the intrinsic disadvantage that the provenance must be stored and accessed using a different model than the actual data. In this paper, we present an alternative approach that uses query rewriting to annotate result tuples with provenance information. The rewritten query and its result use the same model and can thus be queried, stored, and optimized using standard relational database techniques. In the paper we formalize the query rewriting procedures, prove their correctness, and evaluate a first implementation of the ideas using PostgreSQL. As the experiments indicate, our approach provides provenance information efficiently, inducing only a small overhead on normal operations.
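To make the idea of annotating result tuples through query rewriting more concrete, the following is a minimal sketch. It is not the rewrite procedure formalized in the paper. It covers only a plain SELECT-FROM-WHERE query, runs on SQLite instead of PostgreSQL so that it is self-contained, and uses a hypothetical helper `provenance_rewrite`, a hypothetical `prov_<table>_<column>` naming scheme, and made-up `shop` and `sale` tables.

```python
import sqlite3


def provenance_rewrite(select_cols, tables, where, order_by):
    """Hypothetical helper (an assumption, not the paper's algorithm).

    Rewrites a simple SELECT-FROM-WHERE query so that every result tuple
    also carries, as extra columns, the source tuples it was derived from.
    """
    prov_cols = [
        f"{table}.{col} AS prov_{table}_{col}"
        for table, cols in tables.items()
        for col in cols
    ]
    select_list = ", ".join(list(select_cols) + prov_cols)
    from_list = ", ".join(tables)
    return (
        f"SELECT {select_list} FROM {from_list} "
        f"WHERE {where} ORDER BY {order_by}"
    )


# Illustrative data only; the schema is invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shop(name TEXT, city TEXT);
    CREATE TABLE sale(shop_name TEXT, item TEXT);
    INSERT INTO shop VALUES ('A', 'Zurich'), ('B', 'Bern');
    INSERT INTO sale VALUES ('A', 'pen'), ('A', 'ink'), ('B', 'pen');
    """
)

tables = {"shop": ["name", "city"], "sale": ["shop_name", "item"]}

# Original query: SELECT shop.city FROM shop, sale WHERE <condition>
rewritten = provenance_rewrite(
    select_cols=["shop.city"],
    tables=tables,
    where="shop.name = sale.shop_name AND sale.item = 'pen'",
    order_by="shop.city",
)

# The rewritten query is ordinary SQL, so its result is an ordinary relation.
rows = conn.execute(rewritten).fetchall()
for row in rows:
    print(row)
```

Each output row holds the original result value (`city`) followed by the `shop` and `sale` tuples that produced it. Because the annotated result is a plain relation, it could in turn be stored in a table or filtered with further SQL, which is the property the abstract emphasizes.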
