Conference: International School and Symposium on Advanced Distributed Systems

Lineage Tracing in Mediator-Based Information Integration Systems

Abstract

The problem of identifying the data that contributed to a query answer is referred to as lineage tracing. While this problem has been studied extensively in data warehouse systems, it has been identified as a research topic in the mediator-based approach to information integration. A main problem in this context is that a mediator does not store data, and hence it has to communicate with the data sources for both query processing and tracing. While this communication can be expensive, the real issue is that in some situations, after a query has been processed, lineage tracing may become more difficult, e.g., when the schema of a source has changed, or even impossible, e.g., when a source becomes unavailable. In this paper, we study the lineage tracing problem in mediator-based systems and propose a solution that collects "enough" data and metadata during query processing so that tracing remains possible in such situations. We have developed a system prototype called ELIT (for Exploration and Lineage Tracing). To allow more flexibility, ELIT supports lineage tracing in two modes: batch and interactive. Because the setting is distributed, efficiency is a primary practical concern. We therefore investigate ways to reduce the overhead that lineage tracing adds to query processing in the proposed framework. Using some basic query optimization techniques in ELIT, our preliminary experimental results show a considerable increase in efficiency. This indicates that the ideas proposed in the ELIT framework could lend themselves to powerful lineage tracing and data analysis tools once more sophisticated query optimization techniques are incorporated.
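
As a rough illustration of the idea of collecting data and metadata during query processing, the following Python sketch shows a toy mediator that records, for every answer tuple of a join, the source tuples and source schemas that produced it. This is not ELIT's implementation: the `Source` and `Mediator` classes, their methods, and the sample data are all hypothetical names introduced here, and the nested-loop join stands in for whatever query plan a real mediator would use.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical remote data source; the mediator holds no copy of its rows."""
    name: str
    schema: tuple            # column names at query time
    rows: list               # list of tuples
    available: bool = True

    def scan(self):
        if not self.available:
            raise ConnectionError(f"source {self.name} is unavailable")
        return list(self.rows)


@dataclass
class Mediator:
    """Toy mediator that logs lineage while it evaluates a join."""
    # answer tuple -> list of (source name, schema snapshot, contributing row)
    lineage: dict = field(default_factory=dict)

    def join(self, left, right, left_col, right_col):
        li = left.schema.index(left_col)
        ri = right.schema.index(right_col)
        right_rows = right.scan()
        answers = []
        for l_row in left.scan():
            for r_row in right_rows:
                if l_row[li] == r_row[ri]:
                    answer = l_row + r_row
                    answers.append(answer)
                    # Record data AND metadata now, while the sources respond.
                    self.lineage.setdefault(answer, []).extend([
                        (left.name, left.schema, l_row),
                        (right.name, right.schema, r_row),
                    ])
        return answers

    def trace(self, answer):
        """Answer a lineage query from the log alone, without contacting sources."""
        return self.lineage.get(answer, [])


# Assumed sample data for the demonstration.
emp = Source("emp", ("name", "dept"), [("ann", "d1"), ("bob", "d2")])
dept = Source("dept", ("dept", "city"), [("d1", "Paris")])

m = Mediator()
answers = m.join(emp, dept, "dept", "dept")

# Simulate the failure case from the abstract: a source goes away after the query.
emp.available = False
traced = m.trace(answers[0])
print(traced)
```

Because the schema snapshot is stored next to each contributing row, the trace also stays interpretable if a source later changes its schema. The cost is the extra storage and bookkeeping during query processing, which is the overhead the abstract says the authors try to reduce.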