Conference: International Workshop on Embedded Multicore Systems

OpenCL 2.0 Compiler Adaptation on LLVM for PTX Simulators

Abstract

OpenCL continues to gather momentum on both desktop and mobile devices. The new features of OpenCL 2.0 give developers greater expressive power when programming heterogeneous computing environments. Among current experimental simulation environments, gem5-gpu supports only CUDA, whereas GPGPU-Sim can support OpenCL by compiling OpenCL kernel code to PTX with a real GPU driver. However, this driver-based compilation in GPGPU-Sim supports only up to OpenCL 1.2. To support OpenCL 2.0, the compiler must be extended so that it can compile OpenCL 2.0 kernel code to PTX. In this paper, we report our experience in enabling this compiler flow. OpenCL 2.0 provides new features such as dynamic parallelism, work-group built-in functions, and extended atomic built-in functions, among others. The proposed compiler, modified from Low Level Virtual Machine (LLVM), adds these features so that the emulator can support OpenCL 2.0. After modification, the compiler supports dynamic parallelism, work-group built-in functions, and extended atomic built-in functions. We use the existing dynamic parallelism APIs in CUDA to implement the OpenCL 2.0 enqueue kernel feature, and we revise the compilation scheme in Clang. Furthermore, the proposed compiler creates a local buffer for each work-group to use in work-group built-in functions, and adds atomic built-in functions with memory order and memory scope for OpenCL 2.0 to NVPTX. Benchmarks show that the proposed compiler supports the targeted features.
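
The abstract states that the compiler creates a local buffer per work-group for work-group built-in functions. As a rough illustration of why such a buffer is useful, the sketch below models a reduction such as OpenCL 2.0's `work_group_reduce_add` on the host in C++. The function name `work_group_reduce_add_model`, the tree-reduction strategy, and the sequential loop standing in for parallel work-items are all assumptions made for illustration; the abstract does not describe the lowering the paper actually uses.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side model of a work-group add-reduction.
// values[i] is the value contributed by the work-item with local id i.
int work_group_reduce_add_model(const std::vector<int>& values) {
    if (values.empty()) return 0;

    // Stand-in for the per-work-group local buffer: one slot per work-item.
    std::vector<int> local_buf(values);
    const std::size_t n = local_buf.size();

    // Each pass of the outer loop models the code between two barriers.
    for (std::size_t stride = 1; stride < n; stride *= 2) {
        // Work-items whose local id is a multiple of 2*stride
        // accumulate their partner's slot, if that slot exists.
        for (std::size_t lid = 0; lid + stride < n; lid += 2 * stride) {
            local_buf[lid] += local_buf[lid + stride];
        }
        // A real kernel would need a work-group barrier at this point.
    }

    // Slot 0 now holds the sum; every work-item would read it back.
    return local_buf[0];
}
```

In a real kernel the inner loop would run across work-items in parallel, which is why the shared buffer and the barrier between passes are needed. The bound check `lid + stride < n` lets the model handle work-group sizes that are not powers of two.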
