Foreign-language journal: Concurrency and Computation

A flexible I/O arbitration framework for netCDF-based big data processing workflows on high-end supercomputers

Abstract

On the verge of the convergence between high-performance computing and Big Data processing, it has become increasingly prevalent to deploy large-scale data analytics workloads on high-end supercomputers. Such applications often come in the form of complex workflows with various different components, assimilating data from scientific simulations as well as from measurements streamed from sensor networks, such as radars and satellites. For example, as part of the Flagship 2020 (post-K) supercomputer project of Japan, RIKEN is investigating the feasibility of a highly accurate weather forecasting system that would provide a real-time outlook for severe guerrilla rainstorms. One of the main performance bottlenecks of this application is the lack of efficient communication among workflow components, which currently takes place over the parallel file system. In this paper, we present an initial study of a direct communication framework designed for complex workflows that eliminates unnecessary file I/O among components. Specifically, we propose an I/O arbitration layer that provides direct parallel data transfer (both synchronous and asynchronous) among job components that rely on the netCDF interface for performing I/O operations. Our solution requires only minimal modifications to application code. Moreover, we propose a configuration file-based approach that allows users to specify the desired data transfer pattern among workflow components, offering a general solution for different application contexts. We present a preliminary evaluation of the proposed framework on the K Computer (running on up to 4800 compute nodes) using RIKEN's experimental weather forecasting workflow as a case study.