GPU accelerate parallel Odd-Even merge sort: An OpenCL method


Abstract

Odd-even merge sort is a basic problem in the area of computer supported cooperative work in design. However, it is not efficient on CPU platforms because of its high complexity, O(n lg² n). In this paper, we present a novel implementation based on the OpenCL programming model on a recent GPU (Graphics Processing Unit). Our implementation is based on Knuth's algorithm with some modifications. Due to limitations of OpenCL, we use a flag variable to avoid direct backward control flow. As a result, our implementation achieves an 18× speedup compared with the CPU C++ STL quick sort. It also gets almost linear speedup on next-generation GPUs because of the complete parallelism within each iteration. This high performance makes odd-even merge sort effective in practice. Furthermore, the approach used in this paper for coordinating thousands of processing units to work in parallel can also be used in other cooperation areas.
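
The abstract does not show the kernel itself, so the following is only a minimal Python sketch of the idea, not the authors' OpenCL code. It assumes that "Knuth's algorithm" means the merge exchange sort (Algorithm M in The Art of Computer Programming, vol. 3), and it shows one possible reading of the flag variable: a boolean that replaces the algorithm's backward jump, so the whole schedule runs in a single forward loop. The names merge_exchange_passes, odd_even_merge_sort and repeat_inner are hypothetical.

```python
def merge_exchange_passes(n):
    """Yield (p, r, d) for each pass of merge exchange on n keys.

    Every pass is a set of independent compare-exchange operations,
    so on a GPU each pass could map to one kernel launch (assumption).
    """
    if n < 2:
        return
    t = (n - 1).bit_length()          # ceil(lg n) for n >= 2
    top = 1 << (t - 1)
    p, q, r, d = top, top, 0, top
    while p > 0:
        yield p, r, d
        # Flag standing in for the "go back to the compare step" jump.
        repeat_inner = q != p
        if repeat_inner:
            d, q, r = q - p, q >> 1, p
        else:
            p >>= 1
            q, r, d = top, 0, p


def odd_even_merge_sort(keys):
    """Return a sorted copy of keys using the pass schedule above."""
    a = list(keys)
    n = len(a)
    for p, r, d in merge_exchange_passes(n):
        # The pairs (i, i + d) in one pass are disjoint, so each i
        # could be handled by its own work-item; here they run in turn.
        for i in range(n - d):
            if (i & p) == r and a[i] > a[i + d]:
                a[i], a[i + d] = a[i + d], a[i]
    return a
```

For example, odd_even_merge_sort([5, 2, 9, 1, 7]) returns [1, 2, 5, 7, 9]. When n is a power of two, n = 2^t, the schedule has t(t + 1)/2 passes of at most n/2 comparisons each, which is where the O(n lg² n) comparison count comes from; because the comparisons inside a pass are independent, the parallel running time is governed by the number of passes rather than the number of comparisons.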


Bibliographic details

Source: 2011 15th International Conference on Computer Supported Cooperative Work in Design, 2011, pp. 76-83 (8 pages)
Authors: Zhang Keliang; Li Jiajia; Chen Gang; Wu Baifeng
Original format: PDF
Chinese Library Classification: TP391.72
Keywords: GPGPU; GPU; Odd-Even Merge Sort; OpenCL


Similar literature

1. Kepler GPU accelerated recursive sorting using dynamic parallelism [J]. B. Neelima, Bharath Shamsundar, Anjjan Narayan. Concurrency, practice and experience. 2017, No. 4
2. Kepler GPU accelerated recursive sorting using dynamic parallelism [J]. Neelima B., Shamsundar Bharath, Narayan Anjjan. Theoretical and Experimental Plant Physiology. 2017, No. 4
3. GPGPU-accelerated Parallelization Practice and Analysis for Image Segmentation Methods [J]. Guo He, Wang Yu-Xin, Feng Zhen. Information Technology Journal. 2013, No. 20
4. GPU accelerate parallel Odd-Even merge sort: An OpenCL method [C]. Zhang Keliang, Li Jiajia, Chen Gang. International Conference on Computer Supported Cooperative Work in Design. 2011
5. Accelerating discontinuous Galerkin method and finite difference method by using multiple GPUs with CUDA [D]. Mu, Dawei. 2015
6. Accelerated cryo-EM structure determination with parallelisation using GPUs in RELION-2 [O]. Dari Kimanius, Björn O Forsberg, Sjors HW Scheres. 2016
7. An Efficient Implementation of Batcher's Odd-Even Merge Algorithm and Its Application in Parallel Sorting Schemes [O]. 1983
