Rethinking the design and implementation of the I/O software stack for high-performance computing.

Abstract

The current I/O stack for high-performance computing is composed of multiple software layers that shield users from the complexity of I/O performance optimization. However, each layer is usually designed and implemented separately, with limited consideration of its impact on the other layers. This can result in suboptimal I/O performance, because data access locality is weakened, if not lost, on hard disks, a widely used storage medium in high-end storage systems.

In this dissertation, we experimentally demonstrate such issues in four different layers: the operating system process management layer and the MPI-IO middleware layer on the compute server side, and the parallel file system layer and the disk I/O scheduling layer on the data server side. The dissertation makes four contributions, one towards solving each of these issues.

First, we propose DualPar, a data-driven execution model that explores opportunities for effective I/O scheduling and alleviates the I/O bottleneck through cooperation between the I/O scheduler and the process scheduler. Its novelty lies in its ability, in data-driven execution mode, to supply the I/O scheduler with a pool of pre-sorted requests by using process pre-execution and prefetching techniques.

Second, realizing that the well-formed locality an MPI program obtains from collective I/O can be seriously compromised by non-determinism in process scheduling, we propose Resonant I/O, which matches the data request pattern to the pattern of file striping over multiple data servers in order to improve disk efficiency.

Third, since the conventional practice of achieving I/O parallelism through file striping may compromise on-disk data access locality, we propose the IOrchestrator scheduling framework, implemented in the PVFS2 parallel file system. It improves the I/O performance of multi-node storage systems by orchestrating I/O services among programs when such inter-data-server coordination is dynamically determined to be cost-effective.

Fourth, we develop iTransformer, a scheme that employs a small SSD to schedule requests for data on disk. Because an SSD is less space-constrained than more expensive DRAM, iTransformer can buffer larger amounts of dirty data before writing it back to the disk, or prefetch a larger volume of data into the SSD in one batch. In both cases, high disk efficiency can be maintained for highly concurrent requests.
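
The abstract refers to file striping over multiple data servers and to handing the I/O scheduler requests that are already sorted. The following Python sketch illustrates those two ideas only; it is not code from the dissertation. It assumes plain round-robin striping with a fixed stripe size, and the function names `stripe_location` and `group_and_sort` are hypothetical.

```python
from collections import defaultdict


def stripe_location(offset, stripe_size, num_servers):
    """Map a file byte offset to (server index, offset in that server's local file).

    Assumes plain round-robin striping: stripe 0 goes to server 0,
    stripe 1 to server 1, and so on, wrapping around.
    """
    stripe_index = offset // stripe_size
    server = stripe_index % num_servers
    # Each server stores every num_servers-th stripe back to back.
    local_offset = (stripe_index // num_servers) * stripe_size + offset % stripe_size
    return server, local_offset


def group_and_sort(offsets, stripe_size, num_servers):
    """Group request offsets by data server and sort each group by local offset.

    Sorted per-server lists stand in for the "pre-sorted" request pool
    mentioned in the abstract (an illustrative simplification).
    """
    per_server = defaultdict(list)
    for off in offsets:
        server, local = stripe_location(off, stripe_size, num_servers)
        per_server[server].append(local)
    return {server: sorted(locals_) for server, locals_ in per_server.items()}


if __name__ == "__main__":
    stripe = 64 * 1024  # hypothetical 64 KiB stripe size
    requests = [4 * stripe, 0, stripe, 5 * stripe]
    print(group_and_sort(requests, stripe, num_servers=4))
```

With four servers, requests that arrive in the order above are regrouped so that each server sees its own requests in ascending local offset, which is the kind of ordering that lets a disk serve them with fewer seeks.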

Bibliographic record

  • Author: Zhang, Xuechen
  • Author affiliation: Wayne State University
  • Degree-granting institution: Wayne State University
  • Subjects: Engineering, Computer; Computer Science
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 162
  • Original format: PDF
  • Language of text: English
