Journal: ACM Transactions on Computer Systems

SPIN: Seamless Operating System Integration of Peer-to-Peer DMA Between SSDs and GPUs



Abstract

Recent GPUs enable Peer-to-Peer Direct Memory Access (p2p) from fast peripheral devices like NVMe SSDs to exclude the CPU from the data path between them for efficiency. Unfortunately, using p2p to access files is challenging because of the subtleties of low-level non-standard interfaces, which bypass the OS file I/O layers and may hurt system performance. Developers must possess intimate knowledge of low-level interfaces to manually handle the subtleties of data consistency and misaligned accesses.

We present SPIN, which integrates p2p into the standard OS file I/O stack, dynamically activating p2p where appropriate, transparently to the user. It combines p2p with page cache accesses, re-enables read-ahead for sequential reads, all while maintaining standard POSIX FS consistency, portability across GPUs and SSDs, and compatibility with virtual block devices such as software RAID.

We evaluate SPIN on NVIDIA and AMD GPUs using standard file I/O benchmarks, application traces, and end-to-end experiments. SPIN achieves significant performance speedups across a wide range of workloads, exceeding p2p throughput by up to an order of magnitude. It also boosts the performance of an aerial imagery rendering application by 2.6x by dynamically adapting to its input-dependent file access pattern, enables 3.3x higher throughput for a GPU-accelerated log server, and enables 29% faster execution for the highly optimized GPU-accelerated image collage with only 30 changed lines of code.
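To make the idea of combining p2p with page cache accesses more concrete, here is a minimal toy sketch in Python. It is not SPIN's implementation: the function `plan_read`, the `Segment` type, the 4 KiB page size, and the policy itself (serve pages already in the page cache from the cache, send everything else over p2p) are assumptions made for illustration only. The abstract does not describe the actual decision logic.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size, for illustration only


@dataclass
class Segment:
    offset: int  # byte offset in the file where this segment starts
    length: int  # number of bytes in this segment
    path: str    # "page_cache" or "p2p" (hypothetical labels)


def plan_read(offset, length, cached_pages, page_size=PAGE_SIZE):
    """Split the byte range [offset, offset + length) into contiguous runs.

    Hypothetical policy: a page whose index is in `cached_pages` is served
    from the page cache; any other page is read via p2p. Adjacent chunks
    that take the same path are merged into one segment.
    """
    segments = []
    pos = offset
    end = offset + length
    while pos < end:
        page = pos // page_size
        # Stop at the end of the current page or the end of the request.
        chunk_end = min(end, (page + 1) * page_size)
        path = "page_cache" if page in cached_pages else "p2p"
        if segments and segments[-1].path == path:
            segments[-1].length += chunk_end - pos
        else:
            segments.append(Segment(pos, chunk_end - pos, path))
        pos = chunk_end
    return segments


if __name__ == "__main__":
    # Page 1 is cached, so a three-page read is split into three segments.
    for seg in plan_read(0, 3 * PAGE_SIZE, cached_pages={1}):
        print(seg)
```

In this toy model the caller issues a single read request and never sees the split, which mirrors the abstract's claim that p2p is activated transparently to the user. A real system would also have to deal with alignment, read-ahead, and consistency with dirty pages, none of which is modeled here.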
