Journal: IEEE Transactions on Computers

MONTRES: Merge ON-the-Run External Sorting Algorithm for Large Data Volumes on SSD Based Storage Systems

Abstract

External sorting algorithms are commonly used by data-centric applications to sort volumes of data that exceed main memory. Many external sorting algorithms have been proposed in state-of-the-art studies to exploit SSD performance properties and accelerate the sorting process. In this paper, we demonstrate that many of those algorithms fail to scale as the dataset size increases under memory pressure. To address this issue, we propose a new sorting algorithm named MONTRES. MONTRES relies on an SSD performance model while decreasing the overall number of I/O operations. It does so by reducing the amount of temporary data generated during sorting: small values are continuously evicted to the final sorted file. MONTRES scales well with growing datasets under memory pressure. We tested MONTRES using several data distributions, different amounts of main-memory workspace, and three SSD models. Results showed that MONTRES outperforms state-of-the-art algorithms, reducing the sorting execution time of TPC-H datasets by more than 30 percent when the ratio of file size to main-memory size is high.
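
For context, the "temporary data" mentioned in the abstract can be illustrated with a conventional two-phase external merge sort. The sketch below is not MONTRES and is not taken from the paper; it is a minimal baseline under stated assumptions. The function name `external_sort`, the `memory_limit` parameter, and the one-integer-per-line run format are all hypothetical choices made for illustration. It counts how many values are written to temporary run files, which is the quantity the abstract says MONTRES reduces.

```python
import heapq
import os
import tempfile


def external_sort(values, memory_limit):
    """Baseline two-phase external merge sort (illustrative, not MONTRES).

    Holds at most `memory_limit` values in the run-generation buffer.
    Returns (sorted_list, temp_values_written).
    """
    run_paths = []
    temp_written = 0
    buffer = []

    def flush():
        # Sort the in-memory buffer and spill it to a temporary run file.
        nonlocal temp_written
        if not buffer:
            return
        buffer.sort()
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in buffer)
        temp_written += len(buffer)
        run_paths.append(path)
        buffer.clear()

    # Phase 1: run generation. Every input value ends up in a temporary run.
    for v in values:
        buffer.append(v)
        if len(buffer) >= memory_limit:
            flush()
    flush()

    # Phase 2: k-way merge of the sorted runs.
    files = [open(p) for p in run_paths]
    try:
        streams = [(int(line) for line in f) for f in files]
        # Collected into a list only to keep the demo short; a real
        # implementation would stream the merged output to a file.
        result = list(heapq.merge(*streams))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
    return result, temp_written


print(external_sort([9, 4, 7, 1, 8, 2], memory_limit=2))
# -> ([1, 2, 4, 7, 8, 9], 6)
```

In this baseline every input value is written to temporary storage once before the merge begins. According to the abstract, MONTRES lowers that volume by moving small values to the final sorted file during the sort; how it does so is described in the paper and is not reproduced here.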
