
Method and apparatus for partitioning and sorting a data set on a multi-processor system


Abstract

The present invention provides a method and apparatus for partitioning and sorting a data set on a multi-processor system. The multi-processor system has at least one core processor and a plurality of accelerators. The method for partitioning a data set comprises: iteratively partitioning said data set into a plurality of buckets corresponding to different data ranges by using said plurality of accelerators in parallel, such that each of the plurality of buckets can be stored in the local storage of said plurality of accelerators. In each iteration, the method comprises: roughly partitioning said data set into a plurality of large buckets; obtaining parameters of said data set that indicate the distribution of data values in that data set; determining a plurality of data ranges for said data set based on said parameters; and partitioning said plurality of large buckets into a plurality of small buckets, corresponding respectively to the plurality of data ranges, by using said plurality of accelerators in parallel, wherein each of said plurality of accelerators determines by computation, for each element in the large bucket it is partitioning, the data range among the plurality of data ranges to which that element belongs.
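The iteration described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything specific in it is an assumption made for illustration: worker threads stand in for the accelerators, `capacity` stands in for the size of an accelerator's local storage, the distribution parameters are simply the minimum and maximum value, the data ranges are equal-width intervals, and the iteration is written as recursion. The names `split_large_bucket`, `partition_once`, and `partition` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_large_bucket(chunk, lo, hi, n_ranges):
    """One simulated accelerator: place each element into a small bucket."""
    width = (hi - lo) / n_ranges
    small = [[] for _ in range(n_ranges)]
    for x in chunk:
        # The range index is computed arithmetically rather than found by
        # comparing x against every range boundary. The clamp handles x == hi.
        idx = min(int((x - lo) / width), n_ranges - 1)
        small[idx].append(x)
    return small


def partition_once(data, n_accel, n_ranges):
    """One iteration: rough split, range selection, parallel fine split."""
    # Rough partition into contiguous "large buckets", one per accelerator.
    size = -(-len(data) // n_accel)  # ceiling division
    large = [data[i:i + size] for i in range(0, len(data), size)]
    # Assumed distribution parameters: the minimum and maximum value.
    lo, hi = min(data), max(data)
    with ThreadPoolExecutor(max_workers=n_accel) as pool:
        parts = list(pool.map(
            lambda chunk: split_large_bucket(chunk, lo, hi, n_ranges), large))
    # Merge the small buckets that cover the same data range.
    return [[x for p in parts for x in p[r]] for r in range(n_ranges)]


def partition(data, capacity, n_accel=4, n_ranges=4):
    """Repartition until every bucket fits into `capacity` elements."""
    if not data:
        return []
    # A bucket of identical values cannot be split further by value range.
    if len(data) <= capacity or min(data) == max(data):
        return [list(data)]
    buckets = []
    for bucket in partition_once(data, n_accel, n_ranges):
        buckets.extend(partition(bucket, capacity, n_accel, n_ranges))
    return buckets


data = [37, 5, 92, 14, 61, 5, 78, 23, 49, 88, 3, 70]
buckets = partition(data, capacity=3)
# Buckets come back in ascending range order, so sorting each bucket
# independently and concatenating the results sorts the whole data set.
result = [x for b in buckets for x in sorted(b)]
```

Because every element of one bucket is no larger than any element of the next, each bucket could in principle be sorted independently by a separate accelerator, which is why the sketch finishes by sorting bucket by bucket.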
