InfiniBand Verbs on GPU: a case study of controlling an InfiniBand network device from the GPU

Oden Lena; Froening Holger

Abstract

Due to their massive parallelism and high performance per Watt, GPUs have gained high popularity in high-performance computing and are a strong candidate for future exascale systems. But communication and data transfer in GPU-accelerated systems remain a challenging problem. Since the GPU normally is not able to control a network device, a hybrid-programming model is preferred whereby the GPU is used for calculation and the CPU handles the communication. As a result, communication between distributed GPUs suffers from unnecessary overhead, introduced by switching control flow from GPUs to CPUs and vice versa. Furthermore, often a designated CPU thread is required to control GPU-related communication. In this work, we modify user space libraries and device drivers of GPUs and the InfiniBand network device in a way to enable the GPU to control an InfiniBand network device to independently source and sink communication requests without any involvement of the CPU. Our results show that complex networking protocols such as InfiniBand Verbs are better handled by CPUs, since overhead of work request generation cannot be parallelized and is not suitable for the highly parallel programming model of GPUs. The massive number of instructions and accesses to host memory that is required to source and sink a communication request on the GPU slows down the performance. Only through a massive reduction in the complexity of the InfiniBand protocol can some performance improvements be achieved.
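
The abstract's point about work request generation can be illustrated with a minimal sketch in plain C. It is a toy model only: it is neither the libibverbs API nor the authors' modified libraries, and every name in it (`toy_wqe`, `toy_sq`, `toy_post_rdma_write`, the field layout, the queue depth) is hypothetical. It shows the shape of the problem: one thread fills a descriptor field by field, advances the queue head, and only then rings a doorbell, so the steps are ordered and cannot be spread across parallel threads.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_SQ_DEPTH 64u /* hypothetical ring size, chosen for illustration */

/* Hypothetical work-queue element; NOT the real InfiniBand descriptor layout. */
struct toy_wqe {
    uint64_t local_addr;  /* source buffer address */
    uint64_t remote_addr; /* destination address on the peer */
    uint32_t length;      /* payload size in bytes */
    uint32_t lkey;        /* local memory key */
    uint32_t rkey;        /* remote memory key */
    uint32_t opcode;      /* operation type; 1 stands for "RDMA write" here */
};

/* Hypothetical send queue: a descriptor ring plus a doorbell word. */
struct toy_sq {
    struct toy_wqe ring[TOY_SQ_DEPTH];
    uint32_t head;              /* number of requests posted so far */
    volatile uint32_t doorbell; /* stands in for a memory-mapped device register */
    unsigned host_writes;       /* counts stores to the descriptor and the doorbell */
};

static void toy_sq_init(struct toy_sq *sq)
{
    assert(sq != 0);
    sq->head = 0;
    sq->doorbell = 0;
    sq->host_writes = 0;
}

/* Post one request and return the ring slot that was used. */
static uint32_t toy_post_rdma_write(struct toy_sq *sq,
                                    uint64_t laddr, uint64_t raddr,
                                    uint32_t len, uint32_t lkey, uint32_t rkey)
{
    assert(sq != 0);
    uint32_t slot = sq->head % TOY_SQ_DEPTH;
    struct toy_wqe *wqe = &sq->ring[slot];

    /* Step 1: build the descriptor; every field is a separate store. */
    wqe->local_addr = laddr;  sq->host_writes++;
    wqe->remote_addr = raddr; sq->host_writes++;
    wqe->length = len;        sq->host_writes++;
    wqe->lkey = lkey;         sq->host_writes++;
    wqe->rkey = rkey;         sq->host_writes++;
    wqe->opcode = 1u;         sq->host_writes++;

    /* Step 2: publish the slot; this must follow the descriptor stores. */
    sq->head++;

    /* Step 3: ring the doorbell so the device would fetch the descriptor. */
    sq->doorbell = sq->head;  sq->host_writes++;

    return slot;
}
```

In this toy model a single request costs seven stores in a fixed order. A real descriptor carries more fields and needs memory fences, which is consistent with the abstract's observation that the instruction and memory-access count, rather than the data transfer itself, dominates when a GPU thread does this work.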

Bibliographic details

Source: Experimental Mechanics, 2017, No. 4, pp. 274-284 (11 pages)
Authors: Oden Lena; Froening Holger
Affiliations:

Fraunhofer Inst Ind Math, Competence Ctr High Performance Comp, Fraunhofer Pl 1, D-67663 Kaiserslautern, Germany;

Heidelberg Univ, Inst Comp Engn, Mannheim, Germany;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords: GPUs; communication and data transfer; heterogeneous clusters; InfiniBand; RDMA; GPU-controlled communication

Similar literature

1. NVIDIA TESLA GPUS TO COMMUNICATE FASTER OVER MELLANOX INFINIBAND NETWORKS [J]. Desktop engineering. 2010, No. 5
2. NVIDIA TESLA GPUS TO COMMUNICATE FASTER OVER MELLANOX INFINIBAND NETWORKS [J]. Advanced imaging. 2009, No. 11
3. MVAPICH2-GPU: optimized GPU to GPU communication for InfiniBand clusters [J]. Hao Wang, Sreeram Potluri, Miao Luo. Computer science. 2011, No. 3a4
4. Infiniband-Verbs on GPU: A Case Study of Controlling an Infiniband Network Device from the GPU [C]. Oden L., Froning H., Pfreundt F.-J. IEEE International Parallel Distributed Processing Symposium. 2014
5. Kernel Mechanisms for Efficient GPU Accelerated Deep Neural Network Inference on Embedded Devices [D]. Nigam, Hemant. 2018
6. Comparison of perinatal outcomes in facilities before and after Global Network’s Helping Babies Breathe Implementation Study in Nagpur India [O]. Archana Patel, Akash Bang, Kunal Kurhe. 2019
7. Influence of InfiniBand FDR on the Performance of Remote GPU Virtualization [O]. R. Mayo, E. S. Quintana-Ortí, F. Silla. 2015
