IEEE International Symposium on High Performance Distributed Computing

Towards a Workflow-Aware Distributed Versioning File System for Metacomputing Systems

Abstract

Traditional distributed file systems (DFSs) are often adopted as backplanes to facilitate data access for workflow-based computations in metacomputing systems. For example, during the computation of each job, files may be read from previously completed jobs and written to be consumed by later ones according to data dependencies. However, traditional DFSs, which are built on top of local tree-structured file systems, are not designed for this particular computing environment. In practice, data dependency information, also called dataflow information, is quite useful for the scheduler to exploit concurrency among jobs effectively during scheduling. Unfortunately, traditional DFSs have no effective way to track such information, leaving this burden to the users or schedulers that intend to use it. Moreover, users of a metacomputing system usually treat the workflow instance as the unit of computation. In other words, they need a distinct namespace for each instance, especially when instances are executed concurrently. In this scenario, sharing the jobs' names and resolving conflicts between the names of data files are required. Furthermore, if one instance shares some information with another previously completed instance, only the diff (the differing parts) needs to be stored for this instance, taking the previous instance as the base. Although traditional DFSs offer some approaches to the above requirements (e.g., we can use directories to simulate different namespaces, use symbolic links to share a file between namespaces, and use the tree structure to achieve the diff mechanism), these approaches are neither elegant nor effective because of the semantics they impose on the directories, which prevent the directories from being modified or moved freely. In addition, using symbolic links to share a file is not effective, especially when the two files reside at different computational sites.
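The requirements listed in the abstract (a namespace per workflow instance, diff-only storage over a base instance, and tracking of dataflow between jobs) can be illustrated with a small in-memory sketch. This is not the system described in the paper: the class `InstanceNamespace`, its methods, and the job and file names are all hypothetical and exist only to make the three ideas concrete.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class InstanceNamespace:
    """Hypothetical namespace for one workflow instance.

    Only files written by this instance are stored (the diff); every
    other name is resolved through the chain of base instances.
    """
    name: str
    base: Optional["InstanceNamespace"] = None
    diff: Dict[str, bytes] = field(default_factory=dict)
    # Which job wrote each file stored in this instance's diff.
    writers: Dict[str, str] = field(default_factory=dict)
    # Dataflow edges (producer job, consumer job, file name) seen on reads.
    dataflow: Set[Tuple[str, str, str]] = field(default_factory=set)

    def write(self, job: str, path: str, data: bytes) -> None:
        """Store a file in this instance's diff and remember its producer."""
        self.diff[path] = data
        self.writers[path] = job

    def _owner(self, path: str) -> "InstanceNamespace":
        """Find the nearest instance in the base chain that stores `path`."""
        ns: Optional["InstanceNamespace"] = self
        while ns is not None:
            if path in ns.diff:
                return ns
            ns = ns.base
        raise FileNotFoundError(path)

    def read(self, job: str, path: str) -> bytes:
        """Read a file and record the producer -> consumer dependency."""
        owner = self._owner(path)
        self.dataflow.add((owner.writers[path], job, path))
        return owner.diff[path]


# A completed instance, and a second instance that uses it as its base.
run1 = InstanceNamespace("run1")
run1.write("preprocess", "input.dat", b"raw")
run1.write("solve", "result.dat", b"v1")

run2 = InstanceNamespace("run2", base=run1)
run2.write("solve", "result.dat", b"v2")  # same file name, no conflict with run1
```

In this sketch `run2` stores only `result.dat`; a read of `input.dat` through `run2` falls back to `run1`, and each read adds an edge to `run2.dataflow` that a scheduler could, under these assumptions, consult to find jobs that may run concurrently.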
