2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

Reconciling scratch space consumption, exposure, and volatility to achieve timely staging of job input data


Abstract

Innovative scientific applications and emerging dense data sources are creating a data deluge for high-end computing systems. Processing such large input data typically involves copying (or staging) it onto the supercomputer's specialized high-speed storage, known as scratch space, for sustained high I/O throughput. The current practice of conservatively staging data as early as possible leaves the data vulnerable to storage failures, which may entail re-staging and consequently reduce job throughput. To address this, we present a timely staging framework that combines job startup time predictions, user-specified intermediate nodes, and decentralized data delivery so that input data staging coincides with job startup. By delaying staging until it is necessary, the exposure to failures and its effects can be reduced. Evaluation using both PlanetLab and simulations based on three years of Jaguar (No. 1 in Top500) job logs shows as much as an 85.9% reduction in staging times compared to direct transfers, a 75.2% reduction in wait time on scratch, and a 2.4% reduction in usage/hour.
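The core idea of delaying staging until just before the predicted job start can be illustrated with a minimal sketch. This is not the framework described in the paper: the function name `plan_staging`, the `safety_margin` parameter, and the single fixed-bandwidth transfer model are all assumptions made for illustration, and the sketch ignores intermediate nodes and decentralized delivery entirely.

```python
from dataclasses import dataclass


@dataclass
class StagingPlan:
    start_time: float     # seconds after submission at which staging begins
    transfer_time: float  # estimated seconds needed to move the input data
    exposure: float       # seconds the staged data waits on scratch before the job starts


def plan_staging(predicted_job_start: float,
                 data_bytes: float,
                 bandwidth_bytes_per_s: float,
                 safety_margin: float = 0.5) -> StagingPlan:
    """Pick the latest staging start that still finishes before the predicted job start.

    All parameters are hypothetical: a real system would have to obtain the
    start-time prediction and the achievable bandwidth from elsewhere.
    """
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")

    transfer_time = data_bytes / bandwidth_bytes_per_s
    # Pad the transfer estimate so a slower-than-expected transfer still finishes in time.
    padded_time = transfer_time * (1.0 + safety_margin)
    # Staging cannot begin before the job is submitted (time 0).
    start_time = max(0.0, predicted_job_start - padded_time)
    # Exposure is the idle time on scratch between staging completion and job start.
    exposure = max(0.0, predicted_job_start - (start_time + transfer_time))
    return StagingPlan(start_time, transfer_time, exposure)


# Example: 100 GB of input, 100 MB/s of bandwidth, job predicted to start in one hour.
timely = plan_staging(3600.0, 100e9, 100e6)
# Staging immediately at submission leaves the data on scratch for the rest of the wait.
eager_exposure = 3600.0 - timely.transfer_time
```

Under these assumed numbers the delayed plan leaves the data on scratch for far less time than staging at submission, which is the kind of exposure reduction the abstract refers to. The accuracy of the start-time prediction determines how small the margin can safely be.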
