Dynamic memory optimization and parallelism management for OpenCL

Abstract

Multiprocessor platforms have recently become the prevailing route to high performance. OpenCL (Open Computing Language) is one of the programming standards for heterogeneous multiprocessors and provides portability across these platforms. Our research focuses on platforms with CPUs and GPUs, since GPUs are now in widespread use. On such a platform, two programming issues may significantly affect GPU computing performance: one is the distribution of the workload, and the other is the use of the GPU memory hierarchy. To fully exploit the characteristics of GPUs, programmers must be not only proficient in parallel programming but also familiar with hardware specifications. In this paper, we therefore propose a compilation pass that automatically optimizes OpenCL kernels. The pass transforms a naïve input kernel function by applying kernel function analysis, work-group rearrangement, memory coalescing, and work-item merge. In addition, our framework is implemented in a runtime system, so it can dynamically adjust the optimization parameters according to the hardware specifications. In terms of execution time, the optimized kernels generated by our design may show significant performance improvement over the naïve versions. Although optimizations performed at runtime incur time overhead, in most cases this overhead may be offset by intensive kernel computation or massive input data.
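Two of the transformations named in the abstract, memory coalescing and work-item merge, can be illustrated with a small host-side model. The sketch below is not the authors' compilation pass and does not use OpenCL or LLVM; it is a plain Python approximation under stated assumptions. All function names are hypothetical, the 128-byte segment size is an assumed value rather than one taken from the paper, and the doubling kernel is an invented stand-in for a naïve kernel body.

```python
# Illustrative model only: hypothetical names, not the paper's LLVM pass.

def count_transactions(addresses, segment_bytes=128):
    """Count the distinct aligned memory segments touched by one group of
    work-items. The 128-byte segment size is an assumption."""
    return len({addr // segment_bytes for addr in addresses})


def strided_addresses(n_items, row_len, elem_bytes=4):
    """Uncoalesced pattern: work-item i reads the first element of row i,
    so neighbouring work-items are row_len elements apart."""
    return [i * row_len * elem_bytes for i in range(n_items)]


def contiguous_addresses(n_items, elem_bytes=4):
    """Coalesced pattern: neighbouring work-items read neighbouring elements."""
    return [i * elem_bytes for i in range(n_items)]


def run_naive(data):
    """Naive kernel model: one work-item per element, out[gid] = 2 * data[gid]."""
    out = [0] * len(data)
    for gid in range(len(data)):  # each iteration stands for one work-item
        out[gid] = 2 * data[gid]
    return out


def run_merged(data, merge_factor):
    """Work-item merge model: each work-item handles merge_factor elements,
    so fewer work-items are launched for the same amount of work."""
    n = len(data)
    n_items = (n + merge_factor - 1) // merge_factor  # ceiling division
    out = [0] * n
    for gid in range(n_items):
        for k in range(merge_factor):
            # Interleaved indexing keeps neighbouring work-items on
            # neighbouring elements in every step k.
            idx = gid + k * n_items
            if idx < n:  # guard for sizes not divisible by merge_factor
                out[idx] = 2 * data[idx]
    return out


if __name__ == "__main__":
    print(count_transactions(strided_addresses(32, 32)))
    print(count_transactions(contiguous_addresses(32)))
    print(run_merged(list(range(10)), 4) == run_naive(list(range(10))))
```

In this model, 32 work-items reading 4-byte elements with a stride of 32 elements touch 32 separate segments, while the contiguous pattern touches one. The merged variant produces the same output as the naïve one with fewer work-items, and its merge factor is the kind of parameter a runtime system could choose per device.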

Bibliographic details

Source
International Conference on Information Science, Electronics and Electrical Engineering | 2014 | pp. 776-780 | 5 pages
Authors
Hsu Chao-Hung; Wu I-Wei; Shann Jean Jyh-Jiun
Original format
PDF
Keywords
Graphics processing units; Kernel; Memory management; Optimization; Parallel processing; Random access memory; Registers; GPU; LLVM; OpenCL; dynamic optimization

Similar literature

1. A GPGPU compiler for memory optimization and parallelism management [J]. Yang Yi, Xiang Ping, Kong Jingfei. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages. 2010, No. 6
2. Microarchitecture Optimizations for Exploiting Memory-Level Parallelism [J]. Yuan Chou, Brian Fahs, Santosh Abraham. Computer Architecture News. 2004, No. 2
3. Exploiting multiple levels of parallelism in Molecular Dynamics based calculations via modern techniques and software paradigms on distributed memory computers [J]. Mark E. Tuckerman, D. A. Yarne, Shane O. Samuelson. Computer Physics Communications. 2000, No. 1a2
4. Dynamic memory optimization and parallelism management for OpenCL [C]. Hsu Chao-Hung, Wu I-Wei, Shann Jean Jyh-Jiun. International Conference on Information Science, Electronics and Electrical Engineering. 2014
5. Dynamic Parallelism in GPU Optimized Barnes Hut Trees for Molecular Dynamics Simulations [D]. Carranza Zuniga, Melisa. 2017
6. Meta-analysis of Protocolized Goal-Directed Hemodynamic Optimization for the Management of Severe Sepsis and Septic Shock in the Emergency Department [O]. Charles R. Wira, Kelly Dodge, John Sather. 2014
7. A GPGPU Compiler for Memory Optimization and Parallelism Management [O]. Yi Yang, Ping Xiang, Jingfei Kong. 2010
8. Questions of Dynamic Optimization of the Information Process in a Digital Computer (EtsVM) with Highly Developed Parallelism [R]. Petukhov, I. A. 1969
