
Provenance support for service-based infrastructure


Abstract

Service-based architectures represent the next evolutionary step in the development of e-science, namely the transformation of the Internet from a commercial marketplace into a mechanism for sharing multidisciplinary scientific resources. Although scientists in many disciplines have become increasingly reliant on distributed computing technologies for data processing and dissemination, the record of the processing history and origin of a data product, that is, its data provenance, is often nonexistent, incomplete, or impossible for potential users to recover. This thesis addresses data provenance issues in service-based environments, in particular how a scientist who executes a workflow in such an environment can (1) document the data provenance of a data item created by the execution, and (2) use the provenance documentation as a recipe to re-execute the workflow. The thesis proposes a provenance model for delivering data provenance support in a service-based environment. Using an example scenario of a scientific workflow in the astrophysics domain, we explore and identify the components of the provenance model. The model includes a technique for collecting and recording data provenance for service-based workflow executions at runtime. To record the collected provenance, the thesis also proposes a representation specification that describes the processing history through which a piece of data was derived. In addition, the thesis proposes query interfaces that allow recorded provenance to be queried, formulates a technique for constructing provenance graphs, and supports the re-execution of past workflows. The provenance representation specification, the collection technique, and the query interfaces have been used to implement a prototype system that demonstrates the proposed model. The thesis also experimentally evaluates the scalability of the implemented components.
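
The abstract's three ideas (recording provenance at runtime, building a provenance graph, and re-executing a workflow from its provenance) can be illustrated with a minimal sketch. This is not the thesis's model or prototype: the class `ProvenanceStore`, its methods, the record fields, and the toy services `calibrate` and `stack` are all hypothetical names chosen for illustration, and services are assumed to be plain in-process functions rather than remote service calls.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ProvenanceRecord:
    """Hypothetical record: which service produced an item from which inputs."""
    output_id: str
    service: str
    input_ids: List[str]


class ProvenanceStore:
    """Illustrative in-memory store; not the representation proposed in the thesis."""

    def __init__(self) -> None:
        self.records: Dict[str, ProvenanceRecord] = {}
        self.values: Dict[str, Any] = {}

    def add_source(self, data_id: str, value: Any) -> None:
        # Source data has no provenance record of its own.
        self.values[data_id] = value

    def invoke(self, service: str, func: Callable[..., Any],
               input_ids: List[str], output_id: str) -> Any:
        # Run the step and record its provenance at the moment of execution.
        result = func(*(self.values[i] for i in input_ids))
        self.values[output_id] = result
        self.records[output_id] = ProvenanceRecord(output_id, service, list(input_ids))
        return result

    def lineage(self, data_id: str) -> List[Tuple[str, str, str]]:
        # Walk backwards from data_id, collecting (input, service, output) edges.
        edges, seen, pending = [], set(), [data_id]
        while pending:
            current = pending.pop()
            if current in seen:
                continue
            seen.add(current)
            record = self.records.get(current)
            if record is None:
                continue  # reached source data
            for input_id in record.input_ids:
                edges.append((input_id, record.service, current))
                pending.append(input_id)
        return edges

    def reexecute(self, data_id: str, services: Dict[str, Callable[..., Any]],
                  sources: Dict[str, Any]) -> Any:
        # Use the recorded provenance as a recipe: recompute inputs, then the step.
        record = self.records.get(data_id)
        if record is None:
            return sources[data_id]
        args = [self.reexecute(i, services, sources) for i in record.input_ids]
        return services[record.service](*args)


# Toy two-step workflow with made-up services.
services = {
    "calibrate": lambda image: [pixel - 1 for pixel in image],
    "stack": lambda a, b: [x + y for x, y in zip(a, b)],
}
store = ProvenanceStore()
store.add_source("raw1", [3, 4])
store.add_source("raw2", [5, 6])
store.invoke("calibrate", services["calibrate"], ["raw1"], "cal1")
store.invoke("calibrate", services["calibrate"], ["raw2"], "cal2")
final = store.invoke("stack", services["stack"], ["cal1", "cal2"], "final")
```

Calling `store.lineage("final")` returns the edges of the provenance graph for the final item, and `store.reexecute("final", services, {"raw1": [3, 4], "raw2": [5, 6]})` recomputes it from the source data alone. A real service-based setting would also need to record service endpoints, parameters, and timestamps, which this sketch omits.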

Bibliographic record

  • Author

    Rajbhandari Shrija;

  • Author affiliation
  • Year 2007
  • Total pages
  • Original format PDF
  • Language English
  • Chinese Library Classification
