Fast Parallel Sorting Algorithms on GPUs

Bilal Jan; Bartolomeo Montrucchio; Carlo Ragusa

Abstract

This paper presents a comparative analysis of three widely used parallel sorting algorithms, odd-even sort, rank sort and bitonic sort, in terms of sorting rate, sorting time and speed-up on a CPU and on different GPU architectures. Alongside, we have implemented a novel parallel algorithm, the min-max butterfly network, for finding the minimum and maximum in large data sets. All algorithms have been implemented using the data-parallel model available on multi-core GPUs through the OpenCL specification, in order to achieve high performance. Our results show a minimum speed-up of 19x for bitonic sort over odd-even sort for small queue sizes on the CPU, and a maximum speed-up of 2300x for very large queue sizes on the Nvidia Quadro 6000 GPU architecture. Our implementation of full-butterfly network sorting performs relatively better than all three sorting techniques: bitonic, odd-even and rank sort. For the min-max butterfly network, our findings report a high speed-up on the Nvidia Quadro 6000 GPU for large data set sizes reaching 2^24, with much lower sorting time.
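For readers unfamiliar with the three sorting algorithms compared above, the following is a minimal sequential sketch in Python of their textbook forms. It is not the authors' OpenCL implementation, and the function names are chosen here for illustration only. In each function the innermost loop over `i` consists of independent steps, which is the part a data-parallel implementation could plausibly map to one GPU work-item per index. The min-max butterfly network is not sketched, because the abstract does not describe its structure.

```python
def odd_even_sort(values):
    """Odd-even transposition sort: n phases of neighbor compare-exchanges."""
    a = list(values)
    n = len(a)
    for phase in range(n):
        # Even phases pair (0,1), (2,3), ...; odd phases pair (1,2), (3,4), ...
        # Pairs within one phase are disjoint, so they could run in parallel.
        for i in range(phase % 2, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


def rank_sort(values):
    """Rank (enumeration) sort: an element's output slot is its rank."""
    a = list(values)
    out = [None] * len(a)
    for i, x in enumerate(a):  # each i is independent of the others
        # Rank = number of elements that must precede x; ties break by index.
        rank = sum(1 for j, y in enumerate(a) if y < x or (y == x and j < i))
        out[rank] = x
    return out


def bitonic_sort(values):
    """Bitonic sorting network; the length must be a power of two."""
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:  # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:  # compare distance inside the current merge step
            for i in range(n):  # each i is independent of the others
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    if (a[i] > a[partner]) == ascending:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a
```

For example, `bitonic_sort([5, 2, 9, 1, 7, 3, 8, 6])` returns the list in ascending order. Odd-even sort needs n phases and rank sort performs n comparisons per element, while the bitonic network needs only on the order of log^2 n compare-exchange steps, which is consistent with the direction of the speed-ups reported above.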

Bibliographic details

Source: International Journal of Distributed and Parallel Systems, 2012, Issue 6
Authors: Bilal Jan; Bartolomeo Montrucchio; Carlo Ragusa
Original format: PDF
Chinese Library Classification: TP311.13
