IEEE International Symposium on Parallel Distributed Processing; IPDPS 2009

Adaptable, metadata rich IO methods for portable high performance IO

Abstract

Since IO performance on HPC machines strongly depends on machine characteristics and configuration, it is important to carefully tune IO libraries and make good use of appropriate library APIs. For instance, on current petascale machines, independent IO tends to outperform collective IO, in part due to bottlenecks at the metadata server. The problem is exacerbated by scaling issues, since each IO library scales differently on each machine and typically operates efficiently up to different levels of scale on different machines. With scientific codes being run on a variety of HPC resources, efficient code execution requires us to address three important issues: (1) end users should be able to select the most efficient IO methods for their codes, with minimal effort in terms of code updates or alterations; (2) such performance-driven choices should not prevent data from being stored in the desired file formats, since those are crucial for later data analysis; and (3) it is important to have efficient ways of identifying and selecting certain data for analysis, to help end users cope with the flood of data produced by high end codes. This paper employs ADIOS, the adaptable IO system, as an IO API to address (1)-(3) above. Concerning (1), ADIOS makes it possible to independently select the IO methods being used by each grouping of data in an application, so that end users can use those IO methods that exhibit the best performance based on both IO patterns and the underlying hardware. In this paper, we also use this facility of ADIOS to experimentally evaluate alternative methods for high performance IO on petascale machines. Specific examples studied include methods that use strong file consistency vs. delayed parallel data consistency, as provided by MPI-IO or POSIX IO.
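The idea in (1), choosing an IO method per data grouping without touching application code, can be illustrated with a minimal Python sketch. This is not the ADIOS API: the names `METHODS`, `GROUP_METHOD`, `write_group`, and both write strategies are hypothetical, and the group names are made up for illustration. The point is only that the mapping from group to method lives in a configuration table, so switching methods is a configuration change rather than a code change.

```python
import os
from typing import Callable, Dict

def write_posix(path: str, payload: bytes) -> int:
    """Hypothetical strategy: unbuffered POSIX-style write, then fsync."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(payload)
        total = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while total < len(view):
            total += os.write(fd, view[total:])
        os.fsync(fd)
    finally:
        os.close(fd)
    return total

def write_buffered(path: str, payload: bytes) -> int:
    """Hypothetical strategy: buffered write through Python's file object."""
    with open(path, "wb") as f:
        return f.write(payload)

# Registry of available methods (names are assumptions, not ADIOS method names).
METHODS: Dict[str, Callable[[str, bytes], int]] = {
    "POSIX": write_posix,
    "BUFFERED": write_buffered,
}

# Per-group selection; in a real system this would come from a config file.
GROUP_METHOD: Dict[str, str] = {
    "restart": "POSIX",
    "diagnostics": "BUFFERED",
}

def write_group(group: str, path: str, payload: bytes,
                config: Dict[str, str] = GROUP_METHOD) -> int:
    """Write one data grouping using whichever method the config assigns it."""
    method = METHODS[config[group]]
    return method(path, payload)
```

Calling `write_group("restart", path, data)` and `write_group("diagnostics", path, data)` uses different write paths, and passing a different `config` dictionary switches a group's method without editing the caller.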
Concerning (2), to avoid linking IO methods to specific file formats and attain high IO performance, ADIOS introduces an efficient intermediate file format, termed BP, which can be converted, at small cost, to the standard file formats used by analysis tools, such as NetCDF and HDF-5. Concerning (3), associated with BP are efficient methods for data characterization, which compute attributes that can be used to identify data sets without having to inspect or analyze the entire data contents of large files.
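The data characterization idea in (3) can be sketched in a few lines of Python. The abstract does not say which attributes BP computes, so the choice of element count, minimum, and maximum here is an assumption, and `Characteristics`, `characterize`, and `may_contain` are hypothetical names rather than part of ADIOS or BP. The sketch shows how small per-data-set summaries, computed once at write time, let a reader rule data sets in or out without scanning their contents.

```python
from dataclasses import dataclass
from typing import List, Sequence

@dataclass(frozen=True)
class Characteristics:
    """Small summary stored alongside a data set (assumed attributes)."""
    name: str
    count: int
    minimum: float
    maximum: float

def characterize(name: str, values: Sequence[float]) -> Characteristics:
    """Compute the summary once, at the time the data set is written."""
    if not values:
        raise ValueError("cannot characterize an empty data set")
    return Characteristics(name, len(values), min(values), max(values))

def may_contain(index: List[Characteristics], lo: float, hi: float) -> List[str]:
    """Return names of data sets whose [min, max] range overlaps [lo, hi].

    Only the summaries are consulted; the data itself is never read.
    A match means the data set may hold values in range, not that it does.
    """
    return [c.name for c in index if c.maximum >= lo and c.minimum <= hi]
```

For example, with an index built from `characterize("t0", [1.0, 2.0])` and `characterize("t1", [5.0, 9.0])`, the query `may_contain(index, 4.0, 6.0)` selects only `"t1"`, so `"t0"` never needs to be opened.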
