Journal: Parallel Computing

Improved probabilistic I/O scheduling for limited-size Burst-Buffers deployed HPC

Abstract

The I/O bottleneck is a critical problem in current High Performance Computing (HPC) systems because it hinders the performance scalability of a system. Techniques such as I/O scheduling and burst-buffering have been proposed to accelerate data exchange between the compute and storage components of HPC platforms. Probabilistic I/O scheduling, a Markov-chain-based hybrid method that combines these two techniques, controls data transmission by considering the overall load state of the burst-buffer system, in order to mitigate the I/O congestion caused by unpredictable concurrent I/O bursts. However, this method requires a large amount of computation to make online scheduling decisions, which wastes significant computing resources and reduces scheduling efficiency. In this paper, we first introduce the architecture of an HPC platform with deployed burst-buffers, the probabilistic execution model of applications, and the basic probabilistic I/O scheduling method, with a proof of its efficiency based on the Markov-chain framework. Then, as the first improvement, we propose a modularization technique that reduces repeated computation by isolating the heuristic application selection module from the original method and reusing the application ranking result to adjust the I/O scheduling. Next, as the second improvement, we propose a thresholding technique that reduces the number of data transfers on burst-buffers by taking into account the write amplification characteristic of the underlying storage devices. Finally, we conduct extensive simulation experiments showing that our proposed I/O scheduling methods outperform existing I/O scheduling methods that neither account for burst-buffer states nor consider the characteristics of storage devices.
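
The abstract does not give the details of the thresholding technique, so the following is only a toy sketch of the general idea that batching writes up to a threshold lowers the number of transfers a buffer performs. The function `count_transfers`, the flush policy, and the write sizes are all hypothetical and are not taken from the paper.

```python
def count_transfers(write_sizes, threshold):
    """Count transfers under an assumed policy: the buffer is flushed
    only once the pending data reaches `threshold` (hypothetical)."""
    pending = 0
    transfers = 0
    for size in write_sizes:
        pending += size
        if pending >= threshold:
            transfers += 1  # one transfer moves all pending data
            pending = 0
    if pending > 0:
        transfers += 1  # final flush of any leftover data
    return transfers


# Illustrative write sizes in arbitrary units (made up for this sketch).
writes = [4, 4, 4, 4, 4, 4, 4, 4]

# A threshold of 1 flushes after every write; a threshold of 16 batches four writes.
eager = count_transfers(writes, threshold=1)
batched = count_transfers(writes, threshold=16)
print(eager, batched)
```

With these made-up inputs, the batched policy issues fewer transfers than the eager one. How the paper actually chooses the threshold from the write amplification characteristic of the storage device is not stated in the abstract.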
