首页> 外文会议> >Fast generation of long sorted runs for sorting a large file
【24h】

Fast generation of long sorted runs for sorting a large file

机译:快速生成长排序运行以对大型文件进行排序

获取原文

摘要

On sequential machines, most internal sorting algorithms can sort no more than m items using a memory of size m. However, sorting with a heap can produce sorted sequences, called runs, of length about twice the heap size. A second advantage of sorting with a heap is that data I/O and the heap restructuring can be performed concurrently to reduce the sorting time. The third advantage is that it can even sort an arbitrarily large file in one pass if the file satisfies a certain ordering condition. The authors present an algorithm running on a linear array to obtain the above advantages. Specifically, the algorithm can produce runs of length about 2(m+2)p on a linear array of p PEs each with a heap of size m, and can overlap I/O with internal operations. In addition, it can completely sort an arbitrarily large file in one pass providing no item in the file has (m+2)p larger items before it. The algorithm can be modified to perform (mp+2p)-way merges. The authors have also estimated the optimal heap size in each PE for the minimum sorting time.
机译:在顺序机器上,大多数内部排序算法使用大小为m的内存最多可以排序m个项目。但是,使用堆排序可以产生排序的序列,称为运行,其长度约为堆大小的两倍。使用堆排序的第二个优点是可以同时执行数据I / O和堆重组以减少排序时间。第三个优点是,如果文件满足特定的排序条件,它甚至可以一次对任意大的文件进行排序。作者提出了一种在线性阵列上运行的算法,以获得上述优势。具体来说,该算法可以在线性排列的p个PE的线性阵列上生成长度约为2(m + 2)p的游程,并且每个PE的堆大小为m,并且I / O与内部操作重叠。此外,如果文件中没有任何项目具有(m + 2)p个较大的项目,则它可以一次完成对任意一个大型文件的完整排序。可以修改该算法以执行(mp + 2p)方式合并。作者还估计了在最短排序时间内每个PE的最佳堆大小。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号