Support OpenCL 2.0 Compiler on LLVM for PTX Simulators

Abstract

Heterogeneous systems that combine multiple CPUs and GPUs for high-performance computing are becoming increasingly popular, and OpenCL (Open Computing Language) provides a framework for writing programs that execute across heterogeneous devices. Compared with OpenCL 1.2, OpenCL 2.0 gives developers greater expressive power for programming heterogeneous computing environments through new features such as workgroup built-in functions, extended atomic built-in functions, and device-side enqueue. Currently, gem5-gpu, which combines gem5 and GPGPU-Sim, offers an experimental simulation environment for OpenCL. Within gem5-gpu, gem5 supports only CUDA, whereas GPGPU-Sim can support OpenCL by compiling OpenCL kernel code to PTX code using real GPU drivers. This compilation flow, however, supports only up to OpenCL 1.2. Supporting OpenCL 2.0 therefore requires extending the compiler so that it can compile OpenCL 2.0 kernel code to PTX code. In this paper, the proposed compiler is derived from the LLVM (Low Level Virtual Machine) compiler and adds these features so that the emulator can support OpenCL 2.0. The proposed compiler creates a local buffer for each workgroup to implement the workgroup built-in functions, and adds the OpenCL 2.0 atomic built-in functions, with memory order and memory scope, to the NVPTX back end. Furthermore, it uses APIs available in CUDA to implement OpenCL 2.0 device-side kernel enqueue, and it revises the compilation schemes in Clang. The AMD APP SDK 3.0 and NTU OpenCL benchmarks are used to verify that the proposed compiler supports the features of OpenCL 2.0.
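To make the "local buffer per workgroup" idea more concrete, the following is a minimal host-side sketch in C++ of how a workgroup built-in such as `work_group_reduce_add` could be lowered onto a shared scratch buffer. It is an illustration under stated assumptions, not the paper's implementation: the function name `emulated_work_group_reduce_add` is hypothetical, each loop iteration stands in for one work-item, and the end of each loop stands in for a workgroup barrier. The abstract does not say which reduction scheme the proposed compiler generates; a tree reduction is assumed here only because it is a common choice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side emulation of one workgroup (not the paper's code).
// item_values[lid] is the private value held by work-item `lid`.
int emulated_work_group_reduce_add(const std::vector<int>& item_values) {
    const std::size_t n = item_values.size();
    assert(n > 0 && "a workgroup contains at least one work-item");

    // Per-workgroup scratch buffer, the analogue of __local memory in
    // OpenCL or .shared memory in PTX.
    std::vector<int> local_buf(n);

    // Phase 1: every work-item stores its value into the local buffer.
    // The end of this loop plays the role of a workgroup barrier.
    for (std::size_t lid = 0; lid < n; ++lid) {
        local_buf[lid] = item_values[lid];
    }

    // Phase 2: tree reduction. In each step, work-item `lid` adds the
    // element `half` slots away. Writes go to indices below `half` and
    // reads come from indices at or above `half`, so the work-items of
    // one step do not interfere with each other.
    std::size_t active = n;
    while (active > 1) {
        const std::size_t half = (active + 1) / 2;
        for (std::size_t lid = 0; lid + half < active; ++lid) {
            local_buf[lid] += local_buf[lid + half];
        }
        active = half;  // barrier between steps on a real device
    }

    // On a device, every work-item would read the result from slot 0.
    return local_buf[0];
}
```

Calling `emulated_work_group_reduce_add({1, 2, 3, 4, 5})` sums the five per-item values through the scratch buffer. On a real device the two phases would be separated by barriers, and each work-item would then read the broadcast result from slot 0.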