Journal: Future Generation Computer Systems

Data reduction in scientific workflows using provenance monitoring and user steering

Abstract

Scientific workflows need to be executed iteratively, and often interactively, on large input datasets. Removing data from the input datasets is a powerful way to reduce overall execution time in such workflows. When this is done online (i.e., without requiring the user to stop execution, reduce the data, and then resume), it can save considerable time. However, determining which subsets of the input data should be removed becomes a major problem. A related problem is guaranteeing that the workflow system keeps execution and data consistent with the reduction. Keeping track of how users interact with the workflow is essential for data provenance purposes. In this paper, we adopt the "human-in-the-loop" approach, which enables users to steer the running workflow and remove subsets of the datasets online. We propose an adaptive workflow monitoring approach that combines provenance data monitoring and computational steering to support users in analyzing the evolution of key parameters and determining the subset of data to remove. We extend a provenance data model to keep track of users' interactions when they reduce data at runtime. In our experimental validation, we develop a test case from the oil and gas domain, using a 936-core cluster. The results on this test case show that the approach reduces execution time by 32% and the data processed by 14%.
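The idea of online data reduction with provenance tracking can be illustrated with a toy sketch. This is not the authors' system: `ProvenanceStore`, `process`, `run_workflow`, the fixed steering step, and the removal predicate are all hypothetical names and simplifications chosen for illustration. In a real setting the steering action would presumably arrive asynchronously from a user inspecting monitored parameters, rather than at a hard-coded step.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ProvenanceStore:
    """Hypothetical in-memory stand-in for a provenance database."""
    task_records: List[Dict] = field(default_factory=list)
    steering_records: List[Dict] = field(default_factory=list)


def process(item: Dict) -> float:
    """Placeholder for an expensive workflow activity (assumption)."""
    return item["x"] ** 2


def run_workflow(
    items: List[Dict],
    store: ProvenanceStore,
    steer_at: int,
    should_remove: Callable[[Dict], bool],
) -> ProvenanceStore:
    """Process items one by one; at step `steer_at`, simulate a user
    steering action that drops not-yet-processed inputs."""
    pending = list(items)
    step = 0
    while pending:
        item = pending.pop(0)
        result = process(item)
        # Record what was executed so the user can monitor its evolution.
        store.task_records.append({"step": step, "input": item, "output": result})
        step += 1
        if step == steer_at:
            # Simulated steering: remove pending inputs without stopping the run.
            removed = [p for p in pending if should_remove(p)]
            pending = [p for p in pending if not should_remove(p)]
            # Record the interaction itself, so the reduction is traceable.
            store.steering_records.append(
                {"at_step": step, "removed_inputs": removed}
            )
    return store


# Toy input: ten items; after three tasks the "user" cuts inputs with x > 6.
items = [{"id": i, "x": float(i)} for i in range(10)]
store = run_workflow(items, ProvenanceStore(), steer_at=3,
                     should_remove=lambda p: p["x"] > 6.0)
print(len(store.task_records), "tasks executed")
print(len(store.steering_records[0]["removed_inputs"]), "inputs removed")
```

In this toy run, the removed inputs are never processed, and the steering record keeps enough information to explain afterwards why they are missing from the results.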
