
HPC global file system performance analysis using a scientific-application derived benchmark



Abstract

With the exponential growth of high-fidelity sensor and simulated data, the scientific community is increasingly reliant on ultrascale HPC resources to handle its data analysis requirements. However, to use such extreme computing power effectively, the I/O components must be designed in a balanced fashion, as any architectural bottleneck will quickly render the platform intolerably inefficient. To understand the I/O performance of data-intensive applications in realistic computational settings, we develop a lightweight, portable benchmark called MADbench2, which is derived directly from a large-scale cosmic microwave background (CMB) data analysis package. Our study represents one of the most comprehensive I/O analyses of modern parallel file systems, examining a broad range of system architectures and configurations, including Lustre on the Cray XT3, XT4, and Intel Itanium2 clusters; GPFS on IBM Power5 and AMD Opteron platforms; a BlueGene/P installation using the GPFS and PVFS2 file systems; and CXFS on the SGI Altix3700. We present extensive synchronous I/O performance data comparing a number of key parameters, including concurrency, POSIX- versus MPI-IO, and unique- versus shared-file accesses, using both the default environment and highly tuned I/O parameters. Finally, we explore the potential of asynchronous I/O and show that only two of the nine evaluated systems benefited from MPI-2's asynchronous MPI-IO. On those systems, experimental results indicate that the computational intensity required to hide I/O effectively is already close to the practical limit of BLAS3 calculations. Overall, our study quantifies vast differences in the performance and functionality of parallel file systems across state-of-the-art platforms, showing I/O rates that vary by up to 75× on the examined architectures, while providing system designers and computational scientists with a lightweight tool for conducting further analysis.
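As an illustration of the asynchronous, shared-file access pattern the abstract alludes to, the minimal C/MPI sketch below posts a non-blocking MPI-IO write and then performs BLAS3-style computation before waiting on the request. It is not taken from MADbench2; the file name, buffer sizes, and the naive matrix-multiply kernel are illustrative assumptions only.

/*
 * Minimal sketch (not from the paper) of hiding I/O behind computation:
 * each rank posts a non-blocking MPI-IO write to a shared file at its own
 * disjoint offset, does dense matrix-multiply work while the write is in
 * flight, then waits for completion.
 */
#include <mpi.h>
#include <stdlib.h>

#define N_DOUBLES (1 << 20)   /* per-rank write size: assumption */
#define DIM       256         /* size of the dummy compute kernel: assumption */

static void blas3_like_work(double *a, double *b, double *c, int n)
{
    /* Naive dense matrix multiply standing in for the BLAS3 work that
       must overlap the outstanding write for the I/O to be hidden. */
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *buf = malloc(N_DOUBLES * sizeof(double));
    for (int i = 0; i < N_DOUBLES; i++) buf[i] = (double)rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Shared-file access: every rank writes to one file, disjoint offsets. */
    MPI_Offset offset = (MPI_Offset)rank * N_DOUBLES * sizeof(double);

    MPI_Request req;
    MPI_File_iwrite_at(fh, offset, buf, N_DOUBLES, MPI_DOUBLE, &req);

    /* Computation intended to overlap the in-flight write. */
    double *a = calloc((size_t)DIM * DIM, sizeof(double));
    double *b = calloc((size_t)DIM * DIM, sizeof(double));
    double *c = calloc((size_t)DIM * DIM, sizeof(double));
    blas3_like_work(a, b, c, DIM);

    MPI_Wait(&req, MPI_STATUS_IGNORE);   /* write must complete before reuse */

    MPI_File_close(&fh);
    free(buf); free(a); free(b); free(c);
    MPI_Finalize();
    return 0;
}

Whether the posted write actually progresses in the background, rather than being deferred until the MPI_Wait, depends on the MPI implementation and the underlying file system, which is precisely the behaviour the benchmark measures.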

Bibliographic details

  • Source
    Parallel Computing | 2009, No. 6 | pp. 358-373 (16 pages)
  • Author affiliations

    CRD/NERSC, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, MS 50A-1148, Berkeley, CA 94720, United States;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English (eng)
  • Keywords

    I/O benchmarking; global parallel file system; cosmic microwave background; GPFS; Lustre; CXFS; PVFS2


