Conference: IEEE International Symposium on Parallel and Distributed Processing

Towards a storage backend optimized for atomic MPI-I/O for parallel scientific applications


Abstract

Scientific applications are becoming increasingly data-intensive: high-resolution simulations of natural phenomena, climate modeling, large-scale image analysis, etc. Such applications currently manipulate data volumes at the petabyte scale, and as data sizes keep growing we are rapidly advancing towards the exabyte scale. In this context, I/O performance has repeatedly been identified as a bottleneck that degrades application performance. One cause of this bottleneck is that the I/O access patterns generated by such scientific applications do not match the I/O access interfaces exposed by the file systems used as underlying storage back-ends. A particularly difficult challenge in this context is efficiently serving the I/O needs of scientific applications [1], [2] that partition multi-dimensional domains into overlapping subdomains. The subdomains need to be processed in parallel and then stored in a globally shared file. Since the file is a flat sequence of bytes, each subdomain maps to a set of non-contiguous regions in the file. Because the subdomains overlap, under concurrent access these non-contiguous regions may interleave inconsistently unless all regions belonging to one subdomain are grouped together as a single atomic transaction. Therefore, atomicity of non-contiguous, overlapping reads and writes to a shared file is a crucial issue.
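To make the mapping concrete, the following is a minimal sketch (not taken from the paper) of how a rectangular subdomain of a 2D domain turns into non-contiguous byte ranges of a flat file, and how two overlapping subdomains end up sharing bytes. It assumes a row-major layout and 8-byte elements; the function names `subdomain_extents` and `overlapping_bytes` and the 8x8 example domain are hypothetical, chosen only for illustration.

```python
def subdomain_extents(row_start, row_count, col_start, col_count,
                      domain_cols, elem_size=8):
    """Return one (offset, length) byte range per subdomain row.

    Assumes the 2D domain is stored row-major in a flat file, so each
    row of the subdomain is contiguous but consecutive rows are not.
    """
    extents = []
    for row in range(row_start, row_start + row_count):
        offset = (row * domain_cols + col_start) * elem_size
        extents.append((offset, col_count * elem_size))
    return extents


def overlapping_bytes(extents_a, extents_b):
    """Count the bytes covered by both lists of (offset, length) ranges."""
    total = 0
    for off_a, len_a in extents_a:
        for off_b, len_b in extents_b:
            lo = max(off_a, off_b)
            hi = min(off_a + len_a, off_b + len_b)
            if hi > lo:
                total += hi - lo
    return total


# Hypothetical 8x8 domain; two 4x5 subdomains that share columns 3 and 4.
left = subdomain_extents(0, 4, 0, 5, domain_cols=8)
right = subdomain_extents(0, 4, 3, 5, domain_cols=8)

print(left)                            # four separate ranges, one per row
print(overlapping_bytes(left, right))  # bytes written by both subdomains
```

If two processes write `left` and `right` concurrently and each range is applied independently, the shared bytes of different rows can end up coming from different writers. Treating all ranges of one subdomain as a single atomic operation is what rules this out.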
