
Efficient Data Communication between CPU and GPU through Transparent Partial-Page Migration



Abstract

Despite increasing investment in integrated GPUs and next-generation interconnect research, discrete GPUs connected by PCI Express still dominate the market, and the management of data communication between CPU and GPU continues to evolve. Initially, programmers controlled data transfers between CPU and GPU explicitly. To simplify programming and enable system-wide atomic memory operations, GPU vendors have developed a programming model that provides a single virtual address space. The page migration engine in this model automatically migrates pages between CPU and GPU on demand. To meet the needs of high-performance workloads, page sizes tend to grow larger. Because the interconnect has low bandwidth and high latency, migrating a larger page takes longer, which may reduce the overlap of computation and transmission and cause serious performance degradation. In this paper, we propose partial-page migration, which migrates only the requested part of a page, to shorten the migration latency and avoid the performance degradation of whole-page migration as pages become larger. Experiments show that partial-page migration can significantly hide the performance overheads of whole-page migration when the page size is 2MB and the PCI Express bandwidth is 16GB/sec, converting an average 72.72× slowdown into a 1.29× speedup compared with programmer-controlled data transmission. Additionally, we examine the impact of page size on TLB miss rate and the impact of migration unit size on execution time, enabling designers to make informed decisions.
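To make the page-size argument concrete, the following is a rough back-of-the-envelope sketch of the bandwidth-only transfer time for one whole 2MB page versus one smaller migration unit over a 16GB/sec link. The 2MB and 16GB/sec figures come from the abstract; the 64KB migration unit, the function name, and the choice to ignore fixed per-transfer latency and fault-handling cost are assumptions made only for illustration and are not taken from the paper.

```python
def transfer_time_us(num_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Bandwidth-only transfer time in microseconds.

    Assumption: fixed interconnect latency and page-fault handling
    overhead are ignored, so real migrations would take longer.
    """
    return num_bytes / bandwidth_bytes_per_s * 1e6


PAGE_BYTES = 2 * 1024 * 1024   # 2MB page size, as stated in the abstract
PCIE_BW = 16e9                 # 16GB/sec, interpreted here as 16e9 bytes/s
CHUNK_BYTES = 64 * 1024        # hypothetical partial-page migration unit

whole_us = transfer_time_us(PAGE_BYTES, PCIE_BW)
partial_us = transfer_time_us(CHUNK_BYTES, PCIE_BW)

print(f"whole page   : {whole_us:.3f} us")
print(f"partial chunk: {partial_us:.3f} us")
print(f"ratio        : {whole_us / partial_us:.0f}x")
```

Under these assumptions the faulting thread waits for roughly 131 microseconds of pure transfer time with whole-page migration, against about 4 microseconds for the hypothetical 64KB unit. This only illustrates why a smaller migration unit can shorten the stall before computation resumes; it does not reproduce the paper's measured results.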
