ActiveSpaces: Exploring dynamic code deployment for extreme scale data processing

Ciprian Docan; Fan Zhang; Tong Jin; Hoang Bui; Qian Sun; Julian Cummings; Norbert Podhorszki; Scott Klasky; Manish Parashar


Abstract

Managing the large volumes of data produced by emerging scientific and engineering simulations running on leadership-class resources has become a critical challenge. The data have to be extracted off the computing nodes and transported to consumer nodes so that it can be processed, analyzed, visualized, archived, and so on. Several recent research efforts have addressed data-related challenges at different levels. One attractive approach is to offload expensive input/output operations to a smaller set of dedicated computing nodes known as a staging area. However, even using this approach, the data still have to be moved from the staging area to consumer nodes for processing, which continues to be a bottleneck. In this paper, we investigate an alternate approach, namely moving the data-processing code to the staging area instead of moving the data to the data-processing code. Specifically, we describe the ActiveSpaces framework, which provides (1) programming support for defining the data-processing routines to be downloaded to the staging area and (2) runtime mechanisms for transporting codes associated with these routines to the staging area, executing the routines on the nodes that are part of the staging area, and returning the results. We also present an experimental performance evaluation of ActiveSpaces using applications running on the Cray XT5 at Oak Ridge National Laboratory. Finally, we use a coupled fusion application workflow to explore the trade-offs between transporting data and transporting the code required for data processing during coupling, and we characterize sweet spots for each option.
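
The trade-off named at the end of the abstract (transporting data versus transporting code) can be illustrated with a toy cost model. This is only a sketch under stated assumptions: the function names, the linear cost terms, and all sizes and rates are hypothetical, and none of it is taken from the ActiveSpaces API or from the paper's measurements.

```python
# Illustrative cost model only. Every name and number here is a hypothetical
# assumption, not an API or a measurement from ActiveSpaces.

def time_move_data(data_bytes, bandwidth, consumer_rate):
    """Seconds to ship the data to a consumer node and process it there."""
    return data_bytes / bandwidth + data_bytes / consumer_rate


def time_move_code(data_bytes, code_bytes, result_bytes, bandwidth, staging_rate):
    """Seconds to ship the routine to the staging area, run it in place,
    and return only the result."""
    transfer = (code_bytes + result_bytes) / bandwidth
    compute = data_bytes / staging_rate
    return transfer + compute


def cheaper_option(data_bytes, code_bytes, result_bytes,
                   bandwidth, consumer_rate, staging_rate):
    """Return "move_code" or "move_data", whichever the model predicts is faster."""
    t_data = time_move_data(data_bytes, bandwidth, consumer_rate)
    t_code = time_move_code(data_bytes, code_bytes, result_bytes,
                            bandwidth, staging_rate)
    return "move_code" if t_code < t_data else "move_data"


if __name__ == "__main__":
    # Hypothetical sizes in bytes and rates in bytes per second.
    print(cheaper_option(1e9, 1e5, 1e3, bandwidth=1e9,
                         consumer_rate=1e9, staging_rate=1e9))
```

In this simplified model, moving the code tends to win when the routine and its result are small relative to the data and the staging nodes process at a rate comparable to the consumer nodes; a much slower staging rate shifts the balance back toward moving the data.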


Bibliographic record

Source
Concurrency and Computation: Practice and Experience | 2015, Issue 14 | pp. 3724-3745 | 22 pages
Authors
Ciprian Docan; Fan Zhang; Tong Jin; Hoang Bui; Qian Sun; Julian Cummings; Norbert Podhorszki; Scott Klasky; Manish Parashar
Author affiliations

NSF Cloud and Autonomic Computing Center, Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, NJ, USA;

NSF Cloud and Autonomic Computing Center, Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, NJ, USA;

NSF Cloud and Autonomic Computing Center, Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, NJ, USA;

NSF Cloud and Autonomic Computing Center, Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, NJ, USA;

NSF Cloud and Autonomic Computing Center, Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, NJ, USA;

Department of Computer Science, California Institute of Technology, Pasadena, CA, USA;

Oak Ridge National Laboratory, Tennessee, TN, USA;

Oak Ridge National Laboratory, Tennessee, TN, USA;

NSF Cloud and Autonomic Computing Center, Rutgers Discovery Informatics Institute, Rutgers University, Piscataway, NJ, USA;

Original format: PDF
Language: English
Keywords
dynamic code deployment; in situ data processing; data-intensive application workflows; coupled simulations

