Journal: Future Generation Computer Systems

Tiered data management system: Accelerating data processing on HPC systems

Abstract

The explosion of scientific data generated by large-scale simulations and advanced sensors makes scientific workflows more complex and more data-intensive. Supporting these data-intensive workflows on high-performance computing (HPC) systems presents new challenges in data management due to their scale, coordination behaviours, and overall complexity. In this paper, we propose the Tiered Data Management System (TDMS) to accelerate scientific workflows on HPC systems. TDMS prevents repetitive data movement by providing efficient data sharing on top of a tiered storage architecture. Customized data management for common workflow access patterns allows users to make full use of the advantages of the different storage tiers. An extended application interface, which supports user-defined data management strategies, strengthens the system's ability to handle diverse storage architectures and application scenarios. Moreover, we propose a data-aware task scheduling module that launches tasks on the compute nodes where the locality of the required data can be exploited most fully. We build a prototype and deploy it on a typical HPC system. We evaluate the performance of TDMS with realistic workflows, and the experiments show that TDMS can optimize I/O performance and provide up to a 1.54x speedup for data-intensive workflows compared with the Lustre file system. © 2019 Elsevier B.V. All rights reserved.
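
The idea of data-aware task scheduling mentioned in the abstract can be illustrated with a small sketch. This is not the TDMS scheduler, whose algorithm the abstract does not describe. It assumes a simple greedy policy: send a task to the free node that already holds the most input bytes. The function `pick_node` and the names `task_inputs`, `node_cache`, and `free_nodes` are hypothetical and exist only for this example.

```python
from typing import Dict, List, Set


def pick_node(task_inputs: Dict[str, int],
              node_cache: Dict[str, Set[str]],
              free_nodes: List[str]) -> str:
    """Return the free node that already holds the most input bytes locally.

    Hypothetical greedy policy for illustration, not the TDMS algorithm.
    task_inputs maps file name -> size in bytes.
    node_cache maps node name -> set of file names stored on that node.
    """
    def local_bytes(node: str) -> int:
        cached = node_cache.get(node, set())
        return sum(size for name, size in task_inputs.items() if name in cached)

    # max() keeps the first of equally good candidates, so ties resolve
    # to the earliest entry in free_nodes.
    return max(free_nodes, key=local_bytes)


# Example data (made up): one large input and one small input.
inputs = {"mesh.h5": 4_000, "params.json": 10}
cache = {"node-a": {"params.json"}, "node-b": {"mesh.h5"}, "node-c": set()}
chosen = pick_node(inputs, cache, ["node-a", "node-b", "node-c"])
print(chosen)
```

Weighting by bytes rather than by file count favours the node holding the large input, which is where avoiding a transfer saves the most I/O. A production scheduler would presumably also account for node load and storage tier, which this sketch ignores.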