Journal of Parallel and Distributed Computing
On the support of inter-node P2P GPU memory copies in rCUDA


Abstract

Although GPUs are being widely adopted in order to noticeably reduce the execution time of many applications, their use presents several side effects, such as an increased acquisition cost of the cluster nodes or an increased overall energy consumption. To address these concerns, GPU virtualization frameworks could be used. These frameworks allow accelerated applications to transparently use GPUs located in cluster nodes other than the one executing the program. Furthermore, these frameworks aim to offer the same API as the NVIDIA CUDA Runtime API does, although different frameworks provide different degrees of support. In general, and because of the complexity of implementing an efficient mechanism, none of the existing frameworks provides support for memory copies between remote GPUs located in different nodes. In this paper we introduce an efficient mechanism devised to support this kind of memory copy among GPUs located in different cluster nodes. Several options are explored and analyzed, such as the use of the GPUDirect RDMA mechanism. We focus our discussion on the rCUDA remote GPU virtualization framework. Results show that it is possible to implement this kind of memory copy so efficiently that performance even improves with respect to the original performance attained by CUDA when the GPUs are located in the same cluster node. (C) 2019 Elsevier Inc. All rights reserved.
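As a minimal sketch of the operation the abstract refers to, the fragment below issues a standard CUDA Runtime API peer copy between two devices. With plain CUDA both devices must sit in the same node; under a virtualization framework such as rCUDA the same unmodified call may target GPUs assigned from different cluster nodes, which is the case the paper addresses. The device indices and buffer size are illustrative assumptions, not taken from the paper, and the code requires CUDA-capable hardware to run.

```cpp
// Sketch: an inter-GPU copy via the CUDA Runtime API (cudaMemcpyPeer).
// Compile with: nvcc peer_copy.cu
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const size_t bytes = 1 << 20;   // 1 MiB test buffer (illustrative)
    void *src = nullptr, *dst = nullptr;

    cudaSetDevice(0);               // source GPU (possibly remote under rCUDA)
    cudaMalloc(&src, bytes);

    cudaSetDevice(1);               // destination GPU (possibly in another node)
    cudaMalloc(&dst, bytes);

    // The copy an application writes is identical in both scenarios; a
    // virtualization framework must implement the inter-node transfer
    // underneath, e.g. via GPUDirect RDMA as discussed in the paper.
    cudaError_t err = cudaMemcpyPeer(dst, /*dstDevice=*/1,
                                     src, /*srcDevice=*/0, bytes);
    printf("cudaMemcpyPeer: %s\n", cudaGetErrorString(err));

    cudaFree(dst);
    cudaSetDevice(0);
    cudaFree(src);
    return 0;
}
```

The application-visible API is unchanged; only the transport beneath `cudaMemcpyPeer` differs between native CUDA and a remote-GPU framework.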


