
Design and implementation of a parallel I/O runtime system for irregular applications.

Abstract

Many scientific applications are I/O intensive, with demanding I/O requirements that include checkpointing and periodically writing snapshots of the computation. In particular, a large number of these applications exhibit irregular access patterns, in which data are accessed through one or more levels of indirection.

A typical computational science analysis cycle for these applications involves several steps: mesh generation, domain decomposition, simulation, visualization, archival of data, and adjustment of parameters. Two main requirements must therefore be considered. The first is to store the data set in a canonical form so that other steps can use it easily without having to reorganize it. The second is that, to allow a computation to be restarted with a different number of processors, the data set should be stored independently of the number of processors that produced it.

In this dissertation, we present the design, implementation, and evaluation of two parallel I/O runtime systems based on collective I/O techniques for irregular applications. The design is motivated by the requirements of a large number of science and engineering applications, including teraflops applications. The first library has been implemented on top of parallel file systems on MPPs. The user application links to the library's client API, which issues I/O requests using the I/O commands supported by the parallel file systems. In this library, we designed and implemented two kinds of collective I/O schemes: "Collective I/O" and "Pipelined Collective I/O". In "Collective I/O", all processors participate in the I/O simultaneously; in "Pipelined Collective I/O", I/O is overlapped with communication by organizing the processors into groups. As an optimization, chunking and on-line compression mechanisms are included in both collective I/O schemes.

The second library, called "Collective I/O Clustering", has been implemented on workstation clusters. This library is based on the client-I/O server model.
The I/O architecture of workstation clusters usually relies on a set of I/O servers with local disks and a set of diskless nodes. Using the local file system running at each node, such as a UNIX file system, we developed a collective I/O scheme that supports irregular problems. In this library, two I/O configurations are possible. In the first configuration, every node has its own local disk, so the data in a client's I/O buffer can go directly to its local disk, removing the communication between the clients and the I/O servers. In the second configuration, only a subset of the nodes have disks and can serve incoming I/O requests from the clients. In this environment, the client sends its data to the appropriate I/O server, so the communication latency between the clients and the I/O servers must be addressed to improve I/O performance. To reduce the communication latency, we provide a user-controllable striping technique; both the library and user applications can control the stripe unit. As on MPPs, this library incorporates a compression scheme to reduce I/O costs.

Performance results are presented for large-scale parallel systems, including the Intel Paragon at Caltech and the ASCI/Red Teraflops at Sandia National Labs. Results for Collective I/O Clustering on the IBM-SP located at Argonne National Labs are also presented.
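
The requirement that the stored data set be independent of the number of processors can be illustrated with a small simulation. The sketch below is not the dissertation's library: it assumes a generic two-phase collective write, in which each rank first forwards its (global index, value) pairs to the rank responsible for a contiguous file range, and that rank then writes its range in global order. The function names, the toy decomposition rule, and the in-memory "file" are all hypothetical.

```python
from typing import List, Tuple


def partition_irregular(num_elems: int, nprocs: int) -> List[List[int]]:
    """Hypothetical irregular decomposition: each rank owns a scattered
    set of global indices (this list plays the role of the indirection array)."""
    owned: List[List[int]] = [[] for _ in range(nprocs)]
    for g in range(num_elems):
        owned[(g * 7 + 3) % nprocs].append(g)
    return owned


def collective_write(local_idx: List[List[int]],
                     local_val: List[List[float]],
                     num_elems: int) -> List[float]:
    """Simulate a two-phase collective write into a canonical file image."""
    nprocs = len(local_idx)
    chunk = -(-num_elems // nprocs)  # ceiling division: file range per rank

    # Phase 1 (communication): route each element to the rank that owns
    # the contiguous file range containing its global index.
    inbox: List[List[Tuple[int, float]]] = [[] for _ in range(nprocs)]
    for rank in range(nprocs):
        for g, v in zip(local_idx[rank], local_val[rank]):
            inbox[g // chunk].append((g, v))

    # Phase 2 (I/O): each rank sorts what it received and writes its range,
    # so the file ends up ordered by global index.
    file_image = [0.0] * num_elems
    for dest in range(nprocs):
        for g, v in sorted(inbox[dest]):
            file_image[g] = v
    return file_image


def run(num_elems: int, nprocs: int) -> List[float]:
    """Decompose, compute toy values, and write them collectively."""
    owned = partition_irregular(num_elems, nprocs)
    values = [[g * 0.5 for g in idx] for idx in owned]
    return collective_write(owned, values, num_elems)


# The file image is the same whether 3 or 4 ranks produced it.
print(run(10, 3) == run(10, 4))  # True
```

Because the file is ordered by global index rather than by rank, a restart on a different number of processors only needs a new decomposition, not a reorganized file.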

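Bibliographic record

  • Author: No, Jaechun
  • Author affiliation: Syracuse University
  • Degree-granting institution: Syracuse University
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 1999
  • Pages: 155
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology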
