Conference proceedings: High Performance Computing

Machine Learning Using Virtualized GPUs in Cloud Environments



Abstract

Using graphics processing units (GPUs) to accelerate machine learning applications has become a focus of high performance computing (HPC) in recent years. In cloud environments, many cloud-based GPU solutions have been introduced to use GPU resources seamlessly and securely without sacrificing their performance benefits. Two main approaches stand out: direct pass-through technologies available on hypervisors, and virtual GPU technologies introduced by GPU vendors. In this paper, we present a performance study of these two GPU virtualization solutions for machine learning in the cloud. We evaluate the advantages and disadvantages of each solution and report new findings on their performance impact on machine learning applications in different real-world use-case scenarios. We also examine the benefits of virtual GPUs both for machine learning alone and for machine learning applications running alongside other GPU-based applications, such as 3D graphics, on the same multi-GPU server to better utilize computing resources. Based on experimental results from benchmarking machine learning applications developed with TensorFlow, we discuss scaling from one to multiple GPUs and compare the performance of the two virtual GPU solutions. Finally, we show that mixing machine learning with other GPU-based workloads can reduce combined execution time compared to running these workloads sequentially.
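The abstract's final claim rests on a general scheduling principle: two workloads that each leave the device idle part of the time can overlap, so running them concurrently finishes sooner than running them back-to-back. A toy illustration in Python (sleep-based stand-ins for GPU phases; purely illustrative, not the authors' benchmark setup):

```python
import threading
import time

def workload(duration: float) -> None:
    # Stand-in for a GPU job: the simulated device is busy for `duration` seconds.
    time.sleep(duration)

# Sequential: run the ML job, then the graphics job, one after the other.
start = time.perf_counter()
workload(0.2)
workload(0.2)
sequential = time.perf_counter() - start

# Mixed: launch both at once so their busy periods overlap.
start = time.perf_counter()
threads = [threading.Thread(target=workload, args=(0.2,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
combined = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, combined: {combined:.2f}s")
```

On real hardware the gain depends on how well the two workloads' resource demands (compute, memory bandwidth, framebuffer) complement each other, which is what the paper measures.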
