Dataflow detection and applications to workflow scheduling

Yang Wang; Paul Lu


Abstract

In high-performance computing (HPC) workloads (i.e. the set of computations to be completed), the same computational workflow of jobs (e.g. a Pipeline, a Fork&Join, or a Lattice graph) may be applied to different input files and parameters. Each of these workflow instances has the same workflow shape, but accesses (possibly) separate input, intermediate, and output files. Therefore, the selective isolation of each workflow instance can be important for maximizing scheduling flexibility and performance. However, in practice, realizing this benefit is not straightforward due to a variety of problems and constraints. For example, the unmediated interaction of different workflow instances can lead to filename conflicts, in which concurrent workflow instances overwrite common files. For a control-flow-driven batch scheduler, this may result in either unsafe computation when multiple instances run in the same sub-directory or storage overheads when multiple directories are used. We propose a novel approach of selectively coupling and integrating job schedulers and file systems, known as a Workflow-aware File System (WaFS), with two major benefits. First, separate namespaces can be constructed on a per-instance basis to maximize the concurrency of workflow instances, despite filename conflicts, while minimizing storage overhead. Second, exploiting inferred dataflow information, trade-offs can be made between makespan and storage overhead while maintaining correctness. Through a simulation-based study, we have shown the potential benefits of WaFS to job concurrency and we have characterized the trade-offs that can be made between storage overhead and performance. New scheduling policies, Versioned Namespace (VNS), Overwrite-Safe Concurrency (OSC) and hybrids, are made possible by WaFS, with different advantages and disadvantages.
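The filename-conflict problem and the per-instance namespace idea described in the abstract can be illustrated with a small Python sketch. This is not the WaFS implementation: the `Job` record, the helper names (`infer_dataflow`, `conflicting_files`, `resolve`), the file names, and the `instanceN/` path scheme are all hypothetical, and the sketch assumes each job's read and write sets are already known.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical job record: a name plus the files it reads and writes."""
    name: str
    reads: frozenset
    writes: frozenset


def make_pipeline(input_file):
    """Build one two-stage Pipeline instance (illustrative file names)."""
    return [
        Job("stage1", frozenset({input_file, "ref.db"}), frozenset({"tmp.dat"})),
        Job("stage2", frozenset({"tmp.dat"}), frozenset({"out.dat"})),
    ]


def infer_dataflow(jobs):
    """Return (producer, consumer, file) edges for jobs listed in run order."""
    last_writer = {}
    edges = []
    for job in jobs:
        for f in sorted(job.reads):
            if f in last_writer:
                edges.append((last_writer[f], job.name, f))
        for f in job.writes:
            last_writer[f] = job.name
    return edges


def conflicting_files(instance_a, instance_b):
    """Files written by one instance and read or written by the other."""
    writes_a = set().union(*(j.writes for j in instance_a))
    writes_b = set().union(*(j.writes for j in instance_b))
    touched_a = writes_a | set().union(*(j.reads for j in instance_a))
    touched_b = writes_b | set().union(*(j.reads for j in instance_b))
    return (writes_a & touched_b) | (writes_b & touched_a)


def resolve(instance_id, filename, conflicts):
    """Map only conflicting files into a per-instance namespace (assumed scheme)."""
    if filename in conflicts:
        return f"instance{instance_id}/{filename}"
    return filename  # read-only or private files stay shared, saving storage


inst1 = make_pipeline("in1.dat")
inst2 = make_pipeline("in2.dat")
conflicts = conflicting_files(inst1, inst2)
print(infer_dataflow(inst1))
print(sorted(conflicts))
print(resolve(1, "tmp.dat", conflicts), resolve(1, "ref.db", conflicts))
```

In this sketch only the files that one instance writes and the other touches are isolated, so both instances can run concurrently while the shared read-only file is stored once. That mirrors, in a simplified way, the trade-off between concurrency and storage overhead that the abstract describes.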


Bibliographic details

Source
Concurrency, practice and experience | 2011, Issue 11 | pp. 1261-1283 | 23 pages
Authors
Yang Wang; Paul Lu
Author affiliations

Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E8, Yang Wang, National University of Singapore, Singapore;

Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada T6G 2E8;

Indexing information
Original format: PDF
Text language: English
Chinese Library Classification
Keywords
dataflow; concurrency; storage;


