
Parallel Hardware Merge Sorter



Abstract

Sorting is widely used in applications that handle massive amounts of data. Existing techniques accelerate sorting using multiprocessors or GPGPUs, where a data set is partitioned into disjoint subsets so that multiple sorting threads can work in parallel. Hardware sorters implemented in FPGAs have the potential to provide high-speed, low-energy solutions, but the partitioning algorithms used in software systems are so data-dependent that they cannot be easily adopted. The speed of most current sequential sorters still remains around 1 number/cycle. Recently, a new hardware merge sorter broke this speed limit by merging a large number of sorted sequences at a speed proportional to the number of sequences. This paper significantly improves its area and speed scalability by allowing stalls and a variable sorting rate. A 32-port parallel merge tree that merges 32 sequences is implemented in a Virtex-7 FPGA. It merges sequences at an average rate of 31.05 numbers/cycle and reduces the total sorting time by a factor of 160 compared with traditional sequential sorters.
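
To make the merge-tree idea concrete, the following is a minimal software sketch of what a tree of two-way mergers computes: k sorted input sequences are combined pairwise, level by level, into a single sorted output. It is an illustration only, not the hardware design described in the abstract. The names `merge2` and `merge_tree` are hypothetical, and this sequential model emits one value per step, so it does not reproduce the multi-number-per-cycle throughput, the stalls, or the variable sorting rate of the FPGA sorter.

```python
from typing import Iterable, Iterator, List

_END = object()  # sentinel marking an exhausted input stream


def merge2(a: Iterable[int], b: Iterable[int]) -> Iterator[int]:
    """Two-way merge of two ascending streams (hypothetical helper)."""
    ia, ib = iter(a), iter(b)
    x, y = next(ia, _END), next(ib, _END)
    # Emit the smaller head value until one stream runs out.
    while x is not _END and y is not _END:
        if x <= y:
            yield x
            x = next(ia, _END)
        else:
            yield y
            y = next(ib, _END)
    # Drain whichever stream still has values.
    while x is not _END:
        yield x
        x = next(ia, _END)
    while y is not _END:
        yield y
        y = next(ib, _END)


def merge_tree(seqs: List[Iterable[int]]) -> Iterator[int]:
    """Merge k sorted sequences through a binary tree of merge2 nodes."""
    level: List[Iterator[int]] = [iter(s) for s in seqs]
    if not level:
        return iter(())
    # Pair up neighbouring streams until a single root stream remains.
    while len(level) > 1:
        nxt: List[Iterator[int]] = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(merge2(level[i], level[i + 1]))
        if len(level) % 2:  # odd stream out passes through to the next level
            nxt.append(level[-1])
        level = nxt
    return level[0]


# Four sorted runs feeding a two-level tree.
runs = [[1, 5, 9], [2, 6], [0, 7, 8], [3, 4]]
print(list(merge_tree(runs)))  # prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

With 32 inputs the same construction yields a five-level tree. In this software model the root produces one number per step, whereas the abstract reports an average of 31.05 numbers/cycle for the 32-port hardware tree.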
