Proceedings of the 2011 ACM International Conference on Supercomputing

SRC: Damaris - Using Dedicated I/O Cores for Scalable Post-petascale HPC Simulations


Abstract

As we enter the post-petascale era, scientific applications running on large-scale platforms generate increasingly large amounts of data for checkpointing or offline visualization, which puts current storage systems under heavy pressure. Unfortunately, I/O scalability rapidly falls behind the increasing computation power available, thereby reducing overall application performance scalability. We consider the common case of large-scale simulations that alternate between computation phases and I/O phases. Two main approaches have been used to handle these I/O phases: 1) each process writes an individual file, leading to a very large number of files from which it is hard to retrieve scientific insights; 2) processes synchronize and use collective I/O to write to the same shared file. In both cases, because of mandatory communications between processes during the computation phase, all processes enter the I/O phase at the same time, which leads to huge access contention and extreme performance variability. Previous research efforts have focused on improving each layer of the I/O stack separately: at the highest level, scientific data formats such as HDF5 [4] make it possible to keep a high degree of semantics within files while leveraging MPI-IO optimizations. Parallel file systems such as GPFS [5] or PVFS [2] are also the subject of optimization efforts, as they usually represent the main bottleneck of this I/O stack. As a step forward, we introduce Damaris (Dedicated Adaptable Middleware for Application Resources Inline Steering), an approach targeting large-scale multicore SMP supercomputers. The main idea is to dedicate one or a few cores on each node to I/O and data processing, providing an efficient, scalable-by-design, in-compute-node data processing service. Damaris takes into account user-provided information related to the application, the file system and the intended use of the datasets to better schedule data transfers and processing. It may also respond to visualization tools to allow in-situ visualization without impacting the simulation. We tested our implementation of Damaris as an I/O back-end for the CM1 atmospheric model, one of the applications intended to run on the next-generation supercomputer Blue Waters [1] at NCSA. CM1 is a typical MPI application, originally writing one file per process at each checkpoint using HDF5. Deployed on 1024 cores on BluePrint, the Blue Waters interim system at NCSA, with GPFS as the underlying file system, this approach induces up to 10 seconds of overhead in checkpointing phases occurring every 2 minutes, with high variability in the time each process spends writing its data (from 1 to 10 seconds). Using one dedicated I/O core in each 16-core SMP node, we completely remove this overhead. Moreover, the time spared by the I/O core enables a better compression level, thus reducing both the number of files produced (by a factor of 16) and the total data size. Experiments conducted on the French Grid5000 [3] testbed, with PVFS as the underlying file system on a cluster with 24 cores per node, emphasized the benefit of our approach, which allows communication and computation to overlap in a context involving high network contention at multiple levels.
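To illustrate the dedicated-I/O-core idea described above, the following is a minimal sketch using plain MPI point-to-point messages over a node-local communicator: node-local rank 0 is reserved for I/O, while the remaining ranks compute and hand their checkpoint buffers to it. This is a stand-in written against standard MPI, not the Damaris API itself, and the buffer size, iteration count, tag and file names are placeholders chosen for the example.

```cpp
// Minimal sketch of the dedicated-I/O-core pattern, assuming one I/O core
// (node-local rank 0) per SMP node and plain MPI messaging for the hand-off.
#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);

    // Per-node communicator: ranks that share the same SMP node.
    MPI_Comm node_comm;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node_comm);
    int node_rank, node_size;
    MPI_Comm_rank(node_comm, &node_rank);
    MPI_Comm_size(node_comm, &node_size);

    const int FIELD_SIZE = 1 << 20;   // placeholder per-rank buffer size
    const int NUM_STEPS  = 10;        // placeholder number of checkpoints
    const int CKPT_TAG   = 42;        // placeholder message tag

    if (node_rank == 0) {
        // Dedicated I/O core: receive one buffer per compute rank per
        // checkpoint and write it out (here, one file per buffer).
        std::vector<double> buf(FIELD_SIZE);
        for (int step = 0; step < NUM_STEPS; ++step) {
            for (int src = 1; src < node_size; ++src) {
                MPI_Recv(buf.data(), FIELD_SIZE, MPI_DOUBLE, src, CKPT_TAG,
                         node_comm, MPI_STATUS_IGNORE);
                char name[64];
                std::snprintf(name, sizeof(name),
                              "ckpt_step%d_rank%d.bin", step, src);
                if (FILE* f = std::fopen(name, "wb")) {
                    std::fwrite(buf.data(), sizeof(double), buf.size(), f);
                    std::fclose(f);
                }
            }
        }
    } else {
        // Compute core: alternate computation and checkpointing. The data
        // is handed to the node-local I/O core, so compute ranks do not
        // block on the file system themselves.
        std::vector<double> field(FIELD_SIZE, 1.0);
        for (int step = 0; step < NUM_STEPS; ++step) {
            for (double& x : field) x *= 1.0001;   // stand-in for the solver
            MPI_Send(field.data(), FIELD_SIZE, MPI_DOUBLE, 0, CKPT_TAG,
                     node_comm);
        }
    }

    MPI_Comm_free(&node_comm);
    MPI_Finalize();
    return 0;
}
```

In this sketch the hand-off happens through intra-node MPI messages; a production design such as Damaris can instead use shared memory within the node and overlap compression or reformatting on the dedicated core while the compute cores proceed with the next timestep.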

