Source: Journal of Supercomputing

WatCache: a workload-aware temporary cache on the compute side of HPC systems

Abstract

As the computing power of high-performance computing (HPC) systems advances toward exascale, storage systems are stretched to their limits by growing I/O traffic. Researchers are building storage systems on top of compute node-local fast storage devices (such as NVMe SSDs) to alleviate the I/O bottleneck. However, user jobs vary in their I/O bandwidth requirements, so it is a serious waste of expensive storage devices to install them on all compute nodes and build them into a global storage system. In addition, current node-local storage systems need to cope with the challenging small I/O and rank 0 I/O patterns of HPC workloads. In this paper, we present a workload-aware temporary cache (WatCache) to meet these challenges. We design a workload-aware node allocation method that allocates fast storage devices to jobs according to their I/O requirements and merges each job's devices into a separate temporary cache space. We implement a metadata caching strategy that reduces the metadata overhead of I/O requests to improve small I/O performance. We design a data layout strategy that distributes consecutive data exceeding a threshold across multiple devices to achieve higher aggregate bandwidth for rank 0 I/O. Through extensive tests with several I/O benchmarks and applications, we validate that WatCache offers linearly scalable performance and brings significant performance improvements to small I/O and rank 0 I/O patterns.
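
The threshold-based data layout mentioned in the abstract can be illustrated with a minimal sketch. This is not WatCache's implementation: the abstract gives no algorithmic detail, so the function name `plan_layout`, the `home_device` parameter, the fixed `stripe_size`, and the round-robin placement are all assumptions made for illustration. The idea shown is only that a contiguous write at or below the threshold stays on one device, while a larger one is split across several devices.

```python
from typing import List, Tuple

# (device index, offset within the write, chunk length in bytes)
Chunk = Tuple[int, int, int]


def plan_layout(length: int, home_device: int, num_devices: int,
                threshold: int, stripe_size: int) -> List[Chunk]:
    """Split one contiguous write of `length` bytes into per-device chunks.

    Hypothetical policy: all parameter names and the round-robin rule
    are illustrative assumptions, not taken from the paper.
    """
    if length < 0 or num_devices < 1 or stripe_size < 1:
        raise ValueError("invalid layout parameters")
    if not 0 <= home_device < num_devices:
        raise ValueError("home_device out of range")
    if length == 0:
        return []

    # Writes at or below the threshold stay on a single device.
    if length <= threshold:
        return [(home_device, 0, length)]

    # Larger writes are cut into stripes and placed round-robin,
    # starting from the home device.
    chunks: List[Chunk] = []
    offset = 0
    index = 0
    while offset < length:
        size = min(stripe_size, length - offset)
        device = (home_device + index) % num_devices
        chunks.append((device, offset, size))
        offset += size
        index += 1
    return chunks
```

Under these assumptions, a large write issued by a single process (as in the rank 0 I/O pattern) would be spread over several devices rather than limited to the bandwidth of one.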