Conference: IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing

Multiverse: Dynamic VM Provisioning for Virtualized High Performance Computing Clusters


Abstract

Traditionally, HPC workloads have been deployed on bare-metal clusters, but advances in virtualization have paved the way for deploying them on virtualized clusters. However, HPC cluster administrators and providers still face challenges with resource elasticity and large-scale virtual machine (VM) provisioning, because a traditional HPC scheduler and the VM hypervisor (the resource management layer) do not coordinate. This lack of interaction leads to low cluster utilization and low job completion throughput. Furthermore, VM provisioning delays directly affect the overall performance of jobs in the cluster. Hence, there is a need to provision virtualized HPC clusters effectively, making the best use of the physical hardware with minimal provisioning overhead.

To this end, we propose Multiverse, a VM provisioning framework that dynamically spawns VMs for incoming jobs in a virtualized HPC cluster by integrating the HPC scheduler with the VM resource manager. We have implemented this framework on the Slurm scheduler together with the vSphere VM resource manager. To reduce VM provisioning overhead, we use instant cloning, which shares both disk and memory with the parent VM, rather than full VM cloning, which has to boot a new VM from scratch. Measurements with real-world HPC workloads demonstrate that instant cloning is 2.5× faster than full cloning in terms of VM provisioning time. Further, it improves resource utilization by up to 40% and cluster throughput by up to 1.5× compared with full cloning in bursty job arrival scenarios.
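To illustrate why provisioning delay matters for a burst of jobs, the following is a minimal toy model, not the paper's methodology or implementation. It assumes every job needs a freshly provisioned VM and that a fixed number of VM slots are available. The function name `makespan`, the slot count, the job runtime, and the absolute provisioning times are all hypothetical; only the 2.5× ratio between full and instant cloning comes from the abstract.

```python
import heapq


def makespan(arrivals, runtime, provision_time, slots):
    """Return the time at which the last job finishes.

    Toy model (assumption): each job occupies one slot for
    provision_time + runtime seconds, and jobs are served in arrival order.
    """
    free_at = [0.0] * slots  # min-heap: time at which each slot becomes free
    heapq.heapify(free_at)
    finish = 0.0
    for arrival in sorted(arrivals):
        # A job starts once it has arrived and the earliest slot is free.
        start = max(arrival, heapq.heappop(free_at))
        end = start + provision_time + runtime
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


# Hypothetical burst: 8 jobs arriving at t = 0, 60 s each, 4 VM slots.
burst = [0.0] * 8
full_clone = 50.0                  # assumed full-clone provisioning time (s)
instant_clone = full_clone / 2.5   # 2.5x faster, the ratio stated in the abstract

print("full clone makespan:   ", makespan(burst, 60.0, full_clone, 4))
print("instant clone makespan:", makespan(burst, 60.0, instant_clone, 4))
```

Under these assumed values the burst finishes in 220 s with full cloning and 160 s with instant cloning. The gap grows as provisioning time becomes a larger share of each job's slot occupancy, which is the qualitative effect the abstract describes; the figures here are illustrative and are not the paper's measurements.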
