
HPC global file system performance analysis using a scientific-application derived benchmark



Abstract

With the exponential growth of high-fidelity sensor and simulated data, the scientific community is increasingly reliant on ultrascale HPC resources to handle its data analysis requirements. However, to use such extreme computing power effectively, the I/O components must be designed in a balanced fashion, as any architectural bottleneck will quickly render the platform intolerably inefficient. To understand the I/O performance of data-intensive applications in realistic computational settings, we develop a lightweight, portable benchmark called MADbench2, which is derived directly from a large-scale cosmic microwave background (CMB) data analysis package. Our study represents one of the most comprehensive I/O analyses of modern parallel file systems, examining a broad range of system architectures and configurations, including Lustre on the Cray XT3, XT4, and Intel Itanium2 clusters; GPFS on IBM Power5 and AMD Opteron platforms; a BlueGene/P installation using the GPFS and PVFS2 file systems; and CXFS on the SGI Altix3700. We present extensive synchronous I/O performance data comparing a number of key parameters, including concurrency, POSIX- versus MPI-IO, and unique- versus shared-file accesses, using both the default environment and highly tuned I/O parameters. Finally, we explore the potential of asynchronous I/O and show that only two of the nine evaluated systems benefited from MPI-2's asynchronous MPI-IO. On those systems, experimental results indicate that the computational intensity required to hide I/O effectively is already close to the practical limit of BLAS3 calculations. Overall, our study quantifies vast differences in the performance and functionality of parallel file systems across state-of-the-art platforms, showing I/O rates that vary by up to 75× on the examined architectures, while providing system designers and computational scientists with a lightweight tool for conducting further analysis.
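As an illustration of the asynchronous, shared-file access pattern the abstract alludes to, the minimal C/MPI sketch below posts a non-blocking MPI-IO write and then performs BLAS3-style computation before waiting on the request. It is not taken from MADbench2; the file name, buffer sizes, and the naive matrix-multiply kernel are illustrative assumptions only.

/*
 * Minimal sketch (not from the paper) of hiding I/O behind computation:
 * each rank posts a non-blocking MPI-IO write to a shared file at its own
 * disjoint offset, does dense matrix-multiply work while the write is in
 * flight, then waits for completion.
 */
#include <mpi.h>
#include <stdlib.h>

#define N_DOUBLES (1 << 20)   /* per-rank write size: assumption */
#define DIM       256         /* size of the dummy compute kernel: assumption */

static void blas3_like_work(double *a, double *b, double *c, int n)
{
    /* Naive dense matrix multiply standing in for the BLAS3 work that
       must overlap the outstanding write for the I/O to be hidden. */
    for (int i = 0; i < n; i++)
        for (int k = 0; k < n; k++)
            for (int j = 0; j < n; j++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *buf = malloc(N_DOUBLES * sizeof(double));
    for (int i = 0; i < N_DOUBLES; i++) buf[i] = (double)rank;

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "shared.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Shared-file access: every rank writes to one file, disjoint offsets. */
    MPI_Offset offset = (MPI_Offset)rank * N_DOUBLES * sizeof(double);

    MPI_Request req;
    MPI_File_iwrite_at(fh, offset, buf, N_DOUBLES, MPI_DOUBLE, &req);

    /* Computation intended to overlap the in-flight write. */
    double *a = calloc((size_t)DIM * DIM, sizeof(double));
    double *b = calloc((size_t)DIM * DIM, sizeof(double));
    double *c = calloc((size_t)DIM * DIM, sizeof(double));
    blas3_like_work(a, b, c, DIM);

    MPI_Wait(&req, MPI_STATUS_IGNORE);   /* write must complete before reuse */

    MPI_File_close(&fh);
    free(buf); free(a); free(b); free(c);
    MPI_Finalize();
    return 0;
}

Whether the posted write actually progresses in the background, rather than being deferred until the MPI_Wait, depends on the MPI implementation and the underlying file system, which is precisely the behaviour the benchmark measures.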

Bibliographic details

  • Source
    Parallel Computing | 2009, No. 6 | pp. 358-373 (16 pages)
  • Author affiliations

    CRD/NERSC, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, MS 50A-1148, Berkeley, CA 94720, United States;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI)
  • Original format: PDF
  • Language: English (eng)
  • Keywords

    I/O benchmarking; global parallel file system; cosmic microwave background; GPFS; Lustre; CXFS; PVFS2


