Comparing Managed Memory and ATS with and without Prefetching on NVIDIA Volta GPUs

Abstract

One of the major differences between many-core and multicore architectures is the presence of two separate memory spaces: a host space and a device space. On NVIDIA GPUs, the device receives data from the host through one of several memory management API calls provided by the CUDA framework, such as cudaMallocManaged and cudaMemcpy. Modern systems, such as the Summit supercomputer, can avoid CUDA calls for memory management altogether and access the same data on both GPU and CPU. This is done via Address Translation Services (ATS), a technology that provides a unified virtual address space for data allocated with malloc and new, provided there is an NVLink connection between the two memory spaces. In this paper, we perform a deep analysis of the performance achieved when using two types of unified virtual memory addressing: UVM and managed memory.
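To make the two allocation styles named in the abstract concrete, the following is a minimal sketch, not the paper's benchmark code. The kernel `increment` and the helpers `run_managed` and `run_ats` are hypothetical names introduced here for illustration. The sketch assumes the `(pointer, size, device, stream)` form of `cudaMemPrefetchAsync` from the CUDA toolkits contemporary with Volta (newer toolkits may use a different signature), and it assumes that the device attribute `cudaDevAttrPageableMemoryAccessUsesHostPageTables` is a reasonable way to detect ATS support.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: adds 1.0f to every element, one thread per index.
__global__ void increment(float *data, size_t n) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1.0f;
}

// Launches the kernel over n elements and waits for it to finish.
static void launch(float *data, size_t n) {
    unsigned threads = 256;
    unsigned blocks = (unsigned)((n + threads - 1) / threads);
    increment<<<blocks, threads>>>(data, n);
    cudaDeviceSynchronize();
}

// Managed-memory path: cudaMallocManaged returns one pointer usable on both
// host and device. With prefetch == true the pages are moved to the GPU
// before the kernel runs instead of being faulted in on first access.
// Returns the sum of all elements, or -1.0 if allocation fails.
double run_managed(size_t n, bool prefetch) {
    float *data = nullptr;
    if (cudaMallocManaged(&data, n * sizeof(float)) != cudaSuccess) return -1.0;
    for (size_t i = 0; i < n; ++i) data[i] = 0.0f;  // first touch on the host

    int device = 0;
    cudaGetDevice(&device);
    // Assumption: four-argument prefetch signature; the return code is
    // ignored because prefetching is only a performance hint here.
    if (prefetch) cudaMemPrefetchAsync(data, n * sizeof(float), device, 0);

    launch(data, n);

    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) sum += data[i];  // pages migrate back on demand
    cudaFree(data);
    return sum;
}

// ATS path: plain malloc, no CUDA allocation call. This only works on
// systems where the GPU can walk the host page tables (assumption: reported
// by cudaDevAttrPageableMemoryAccessUsesHostPageTables).
// Returns -1.0 when that support is not reported.
double run_ats(size_t n) {
    int device = 0, ats = 0;
    cudaGetDevice(&device);
    cudaDeviceGetAttribute(&ats, cudaDevAttrPageableMemoryAccessUsesHostPageTables, device);
    if (!ats) return -1.0;

    float *data = (float *)malloc(n * sizeof(float));
    if (!data) return -1.0;
    for (size_t i = 0; i < n; ++i) data[i] = 0.0f;

    launch(data, n);  // the kernel dereferences the malloc'ed pointer directly

    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) sum += data[i];
    free(data);
    return sum;
}

int main() {
    const size_t n = 1 << 20;
    printf("managed, no prefetch: %.1f\n", run_managed(n, false));
    printf("managed, prefetch:    %.1f\n", run_managed(n, true));
    printf("ATS (malloc):         %.1f\n", run_ats(n));
    return 0;
}
```

On a machine without ATS support, `run_ats` is expected to return -1.0 rather than launch a kernel on pageable memory. Timing each path (for example with CUDA events around `launch`) would be the natural next step toward the kind of comparison the abstract describes, but that is left out to keep the sketch short.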