Journal: Concurrency and Computation: Practice and Experience

In-memory staging and data-centric task placement for coupled scientific simulation workflows

Abstract

Coupled scientific simulation workflows are composed of heterogeneous component applications that simulate different aspects of the physical phenomena being modeled and that interact and exchange significant volumes of data at runtime. As data volumes and generation rates keep growing, the traditional disk I/O-based approach to data movement becomes cost-prohibitive, and workflows require a more scalable and efficient approach to support data movement. Moreover, the cost of moving large volumes of data over the system interconnection network becomes dominant and significantly impacts workflow execution time. Minimizing the amount of network data movement and localizing data transfers are critical for reducing this cost. To achieve this, workflow task placement should exploit data locality to the extent possible and move computation closer to the data. In this paper, we investigate applying in-memory data staging and data-centric task placement to reduce the data movement cost in large-scale coupled simulation workflows. Specifically, we present a distributed data sharing and task execution framework that (1) co-locates in-memory data staging on application compute nodes to store data that needs to be shared or exchanged and (2) uses data-centric task placement to map computations onto processor cores such that a large portion of the data exchanges can be performed through intra-node shared memory. We also present the implementation of the framework and its experimental evaluation on the Titan Cray XK7 petascale supercomputer.
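
To make the idea of data-centric task placement concrete, the following is a minimal sketch, not the framework's actual algorithm, which the abstract does not specify. It assumes a simple greedy heuristic: each task is assigned to the node that already stages the largest share of its input, subject to a per-node limit on free cores. The names `place_tasks`, `network_bytes`, `task_inputs`, and `node_capacity` are hypothetical and chosen only for illustration.

```python
def place_tasks(task_inputs, node_capacity):
    """Greedy data-centric placement (illustrative assumption, not the paper's method).

    task_inputs:   {task: {node: bytes of the task's input staged on that node}}
    node_capacity: {node: number of free cores}
    Returns {task: node}.
    """
    free = dict(node_capacity)
    placement = {}
    # Place the tasks with the largest total input first: misplacing them costs the most.
    order = sorted(task_inputs,
                   key=lambda t: sum(task_inputs[t].values()),
                   reverse=True)
    for task in order:
        staged = task_inputs[task]
        candidates = sorted(n for n in free if free[n] > 0)
        if not candidates:
            raise ValueError("not enough free cores for all tasks")
        # Pick the node holding the most input for this task;
        # ties go to the alphabetically first node because candidates is sorted.
        best = max(candidates, key=lambda n: staged.get(n, 0))
        placement[task] = best
        free[best] -= 1
    return placement


def network_bytes(task_inputs, placement):
    """Bytes that must cross the interconnect: input not staged on the task's own node."""
    return sum(size
               for task, staged in task_inputs.items()
               for node, size in staged.items()
               if node != placement[task])
```

Comparing `network_bytes` for this placement against a locality-unaware one (for example, round-robin) gives a rough sense of how much traffic could move from the interconnect to intra-node shared memory. A real placement would also have to account for core topology, load balance, and data that changes between coupling steps.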
