dOCAL: high-level distributed programming with OpenCL and CUDA

Rasch Ari; Bigge Julian; Wrodarczyk Martin; Schulze Richard; Gorlatch Sergei

Abstract

In the state-of-the-art parallel programming approaches OpenCL and CUDA, so-called host code is required for a program's execution. Efficiently implementing host code is often a cumbersome task, especially when executing OpenCL and CUDA programs on systems with multiple nodes, each comprising different devices, e.g., multi-core CPUs and graphics processing units (GPUs). The programmer is responsible for explicitly managing each node's and device's memory, for synchronizing computations with data transfers between devices of potentially different nodes, and for optimizing data transfers between devices' memories and nodes' main memories, e.g., by using pinned main memory for accelerating data transfers and overlapping the transfers with computations. We develop the distributed OpenCL/CUDA abstraction layer (dOCAL), a novel high-level C++ library that simplifies the development of host code. dOCAL combines major advantages over the state-of-the-art high-level approaches: (1) it simplifies implementing both OpenCL and CUDA host code by providing a simple-to-use, high-level abstraction API; (2) it supports executing arbitrary OpenCL and CUDA programs; (3) it allows conveniently targeting the devices of different nodes by automatically managing node-to-node communications; (4) it simplifies implementing data transfer optimizations by providing different, specially allocated memory regions, e.g., pinned main memory for overlapping data transfers with computations; (5) it optimizes memory management by automatically avoiding unnecessary data transfers; (6) it enables interoperability between OpenCL and CUDA host code for systems with devices from different vendors. Our experiments show that dOCAL significantly simplifies the development of host code for heterogeneous and distributed systems, with a low runtime overhead.
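Point (5) of the abstract, automatically avoiding unnecessary data transfers, can be pictured with a small sketch. The abstract does not describe how dOCAL implements this, so the code below is only one plausible approach: a buffer that tracks which copy (host or device) is current and copies only when a stale copy is requested. The class `TrackedBuffer`, its member functions, and `demo_transfer_count` are hypothetical names invented for this illustration and are not part of dOCAL's API. Device memory is simulated with a second `std::vector` so that the sketch runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: this is NOT dOCAL's API.
// "Device memory" is simulated by a second std::vector so the sketch runs anywhere.
class TrackedBuffer {
public:
    explicit TrackedBuffer(std::size_t n) : host_(n, 0.0f), device_(n, 0.0f) {}

    // Writable host access: afterwards the device copy is considered stale.
    std::vector<float>& host_write() {
        sync_to_host();
        device_valid_ = false;
        return host_;
    }

    // Read-only host access: copies from the device only if the host copy is stale.
    const std::vector<float>& host_read() {
        sync_to_host();
        return host_;
    }

    // Access for a (simulated) kernel that may read and write the buffer.
    std::vector<float>& device_access() {
        if (!device_valid_) {
            device_ = host_;  // stands in for a host-to-device copy
            ++transfers_;
            device_valid_ = true;
        }
        host_valid_ = false;  // the kernel may modify the device copy
        return device_;
    }

    std::size_t transfers() const { return transfers_; }

private:
    void sync_to_host() {
        if (!host_valid_) {
            assert(device_valid_);  // at least one copy is always current
            host_ = device_;        // stands in for a device-to-host copy
            ++transfers_;
            host_valid_ = true;
        }
    }

    std::vector<float> host_, device_;
    bool host_valid_ = true;
    bool device_valid_ = false;
    std::size_t transfers_ = 0;
};

// Runs two simulated kernels back to back and returns the number of copies made.
inline std::size_t demo_transfer_count() {
    TrackedBuffer buf(4);
    for (float& x : buf.host_write()) x = 1.0f;      // host only, no transfer
    for (float& x : buf.device_access()) x *= 2.0f;  // one host-to-device copy
    for (float& x : buf.device_access()) x += 1.0f;  // no copy: device copy is current
    float sum = 0.0f;
    for (float x : buf.host_read()) sum += x;        // one device-to-host copy
    assert(sum == 12.0f);
    (void)sum;
    return buf.transfers();
}
```

Calling `demo_transfer_count()` performs two copies in total: the second simulated kernel reuses the data already on the device, whereas a naive host program that copies before and after every kernel would perform four.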

Bibliographic details

Source: Journal of Supercomputing, 2020, issue 7, pp. 5117-5138 (22 pages)
Authors: Rasch Ari; Bigge Julian; Wrodarczyk Martin; Schulze Richard; Gorlatch Sergei
Author affiliation (all authors): Univ Munster Dept Math & Comp Sci Munster Germany
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: OpenCL; CUDA; Host code; Distributed system; Heterogeneous system; Interoperability; Data transfer optimization

Similar literature

1. GPUBlocks: GUI Programming Tool for CUDA and OpenCL [J]. Hwang Yuan-Shin, Lin Hsih-Hsin, Pai Shen-Hung. Journal of signal processing systems for signal, image, and video technology, 2019, issue 3a4.
2. Programming CUDA and OpenCL: A case study using modern C++ libraries [J]. Demidov D., Ahnert K., Rupp K. SIAM Journal on Scientific Computing, 2013, issue 5.
3. From CUDA to OpenCL: Towards a performance-portable solution for multi-platform GPU programming [J]. Peng Du, Rick Weber, Piotr Luszczek. Parallel Computing, 2012, issue 8.
4. OCAL: An Abstraction for Host-Code Programming with OpenCL and CUDA [C]. Ari Rasch, Martin Wrodarczyk, Richard Schulze. IEEE International Conference on Parallel and Distributed Systems, 2018.
5. Distributed OpenCL: A platform for Distributed, Heterogeneous Computing for Domain Scientists [D]. Dillon, William H. 2012.
6. OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems [O]. John E. Stone, David Gohara, Guochun Shi.
7. Programming CUDA and OpenCL: A case study using modern C++ libraries [O]. Denis Demidov, Karsten Ahnert, Karl Rupp. 2016.
8. High-Level Fault Tolerance in Distributed Programs [R]. Seligman, E., Beguelin, A. 1994.
