Managing scientific workflow provenance.

Abstract

An advantage of scientific workflow systems over traditional approaches is their ability to automatically record the provenance (or lineage) of intermediate and final data products generated during workflow execution. The provenance of a data product contains information about how the product was derived, and is crucial for enabling scientists to easily understand, reproduce, and verify scientific results. This work addresses challenges for managing large amounts of provenance information, and describes efficient approaches to model, store, query, visualize, and explore provenance information.

Specifically, a model of provenance is presented that extends the conventional provenance model, supports nested data, and captures fine-grained lineage information. Novel reduction techniques are described to optimize storage size, update time, and query-response time. A high-level query language is proposed to allow nonexperts to easily express provenance graph queries over the model. Query optimization techniques that leverage the storage reductions are described. These optimizations scale with the size of provenance and query complexity, and can also be used in more general settings to efficiently answer a broad range of path queries over labeled, acyclic directed graphs. To further allow users to explore relevant provenance information, a navigation model for provenance is proposed that provides an integrated approach for creating provenance views, navigating between views, and summarizing views. To demonstrate the approaches presented, a Provenance Browser application has been developed and integrated into the Kepler scientific workflow system.
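To make the idea of a lineage query over a labeled, acyclic directed graph concrete, here is a minimal Python sketch. It is an illustration only and is not the thesis's provenance model, query language, or storage reductions. The class `ProvenanceGraph`, its methods, and the file and step names are all hypothetical, and the traversal is a plain depth-first search with no optimization.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy labeled DAG (hypothetical): each edge points from a derived
    data product to a product it was derived from, labeled with the
    workflow step that performed the derivation."""

    def __init__(self):
        # derived product -> list of (step label, source product)
        self._parents = defaultdict(list)

    def add_dependency(self, derived, label, source):
        self._parents[derived].append((label, source))

    def lineage(self, node, labels=None):
        """Return every product that `node` transitively depends on.

        If `labels` is given, only edges whose label is in `labels`
        are followed (a very simple form of path query)."""
        seen = set()
        stack = [node]
        while stack:
            current = stack.pop()
            for label, parent in self._parents.get(current, []):
                if labels is not None and label not in labels:
                    continue
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


# Made-up example workflow: raw data -> alignment -> tree -> plot.
g = ProvenanceGraph()
g.add_dependency("aligned.fasta", "align", "raw.fasta")
g.add_dependency("tree.nwk", "infer_tree", "aligned.fasta")
g.add_dependency("tree.nwk", "infer_tree", "params.txt")
g.add_dependency("plot.png", "render", "tree.nwk")

full = g.lineage("plot.png")
render_only = g.lineage("plot.png", labels={"render"})
```

Here `full` contains all four upstream products, while `render_only` contains only `tree.nwk`, because the traversal stops at edges with other labels. A naive traversal like this visits every reachable edge on each query, which is the kind of cost that storage reductions and query optimization aim to lower on large provenance graphs.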

Bibliographic record

  • Author: Anand, Manish Kumar
  • Author affiliation: University of California, Davis
  • Degree-granting institution: University of California, Davis
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2010
  • Pages: 206
  • Original format: PDF
  • Language: English (eng)
