
AA-Sort


Abstract

Many sorting algorithms have been studied in the past, but only a few can effectively exploit both SIMD instructions and thread-level parallelism. In this paper, we propose a new parallel sorting algorithm, called Aligned-Access sort (AA-sort), for shared-memory multiprocessors. The AA-sort algorithm takes advantage of SIMD instructions. The key to high performance is eliminating unaligned memory accesses that would reduce the effectiveness of SIMD instructions. We implemented and evaluated the AA-sort on PowerPC® 970MP and Cell Broadband Engine™. In summary, a sequential version of the AA-sort using SIMD instructions outperformed IBM's optimized sequential sorting library by 1.8 times and GPUTeraSort using SIMD instructions by 3.3 times on PowerPC 970MP when sorting 32 M random 32-bit integers. Furthermore, a parallel version of AA-sort demonstrated better scalability with increasing numbers of cores than a parallel version of GPUTeraSort on both platforms.
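The abstract names the central idea (touch memory only at vector-aligned boundaries) without describing the algorithm's internals. The sketch below is an illustration of that idea under stated assumptions, not the authors' implementation: it models a 4-lane vector as a Python list slice, uses a comb-sort style pass over whole vectors so that every access starts at a multiple of `LANES`, and then merges the four sorted lanes with `heapq.merge`. The names `LANES`, `compare_exchange_aligned`, `lanewise_comb_sort`, and `sort_with_aligned_blocks` are hypothetical, and the final merge step is a stand-in chosen for brevity.

```python
import heapq

LANES = 4  # assumption: one 128-bit vector holds four 32-bit integers


def compare_exchange_aligned(data, i, j):
    """Lane-wise compare-exchange of the vectors at vector indices i < j.

    Both slices start at a multiple of LANES, which models an aligned
    vector load/store. Returns True if any lane was swapped.
    """
    lo, hi = i * LANES, j * LANES
    a, b = data[lo:lo + LANES], data[hi:hi + LANES]
    new_lo = [min(x, y) for x, y in zip(a, b)]  # models a vector min
    new_hi = [max(x, y) for x, y in zip(a, b)]  # models a vector max
    data[lo:lo + LANES] = new_lo
    data[hi:hi + LANES] = new_hi
    return new_lo != a


def lanewise_comb_sort(data):
    """Sort each of the LANES interleaved sequences data[k::LANES] in place."""
    if len(data) % LANES != 0:
        raise ValueError("length must be a multiple of LANES")
    n = len(data) // LANES  # number of whole vectors
    gap = n
    swapped = True
    while gap > 1 or swapped:
        gap = max(1, int(gap / 1.3))  # common comb-sort shrink factor
        swapped = False
        for i in range(n - gap):
            if compare_exchange_aligned(data, i, i + gap):
                swapped = True


def sort_with_aligned_blocks(values):
    """Return a sorted copy, using only aligned block accesses while sorting."""
    work = list(values)
    lanewise_comb_sort(work)
    # Each lane is now sorted; merging the lanes yields the full order.
    return list(heapq.merge(*(work[k::LANES] for k in range(LANES))))


if __name__ == "__main__":
    print(sort_with_aligned_blocks([7, 3, 9, 1, 4, 8, 2, 6]))
    # prints [1, 2, 3, 4, 6, 7, 8, 9]
```

Because the compare-exchange always operates on whole vectors at aligned offsets, no step needs to load a vector that straddles two aligned blocks, which is the kind of access the abstract identifies as costly for SIMD code.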
