IEEE High Performance Extreme Computing Conference

Exploiting GPU Direct Access to Non-Volatile Memory to Accelerate Big Data Processing



Abstract

The amount of data collected for analysis is growing at an exponential rate, and with this growth comes an increasing need for computation and storage. Researchers are addressing these needs by building heterogeneous clusters that pair CPUs with computational accelerators such as GPUs and equip them with high-I/O-bandwidth storage devices. One of the main bottlenecks of such heterogeneous systems is the data transfer bandwidth to GPUs when running I/O-intensive applications. The traditional approach moves data from storage to host memory and then transfers it to the GPU, which can limit data throughput and processing and thus degrade end-to-end performance. In this paper, we propose a new framework that addresses this issue by exploiting Peer-to-Peer Direct Memory Access to let the GPU access the storage device directly, thereby improving the performance of parallel data processing applications on a heterogeneous big-data platform. Our heterogeneous cluster provides CPUs and GPUs as computing resources and Non-Volatile Memory Express (NVMe) drives as storage resources. We deploy an Apache Spark platform to execute representative data processing workloads on this cluster and then adopt Peer-to-Peer Direct Memory Access to connect GPUs directly to non-volatile storage, optimizing GPU data access. Experimental results reveal that this heterogeneous Spark platform successfully bypasses host memory and enables GPUs to communicate directly with the NVMe drive, achieving higher data transfer throughput and improving both data communication time and end-to-end performance by 20%.
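The direct NVMe-to-GPU data path the abstract describes can be illustrated with NVIDIA's cuFile API (GPUDirect Storage), which implements the same peer-to-peer idea: the DMA engine moves file data straight into GPU memory without a host bounce buffer. This is a minimal sketch for illustration only; the paper's own framework may use a different P2P DMA mechanism, and the file path is a placeholder. Error handling is abbreviated.

```cuda
// Sketch: read from an NVMe-backed file directly into GPU memory,
// bypassing host memory, via the cuFile (GPUDirect Storage) API.
#include <fcntl.h>
#include <unistd.h>
#include <cuda_runtime.h>
#include <cufile.h>

int main(void) {
    const size_t size = 1 << 20;                 /* 1 MiB read */
    int fd = open("/path/to/data.bin", O_RDONLY | O_DIRECT);

    cuFileDriverOpen();                          /* initialize GDS driver */

    CUfileDescr_t descr = {0};
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
    CUfileHandle_t fh;
    cuFileHandleRegister(&fh, &descr);           /* register the file */

    void *devPtr;
    cudaMalloc(&devPtr, size);
    cuFileBufRegister(devPtr, size, 0);          /* pin GPU buffer for DMA */

    /* DMA straight from the NVMe drive into GPU memory --
       the host-memory bounce buffer of the traditional path is skipped */
    ssize_t n = cuFileRead(fh, devPtr, size,
                           /*file_offset=*/0, /*devPtr_offset=*/0);

    cuFileBufDeregister(devPtr);
    cuFileHandleDeregister(fh);
    cudaFree(devPtr);
    cuFileDriverClose();
    close(fd);
    return n == (ssize_t)size ? 0 : 1;
}
```

With the traditional path, the same read would require a `pread` into pinned host memory followed by a `cudaMemcpy` to the device; eliminating that intermediate copy is what yields the throughput and end-to-end gains reported above.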
