Journal: Cluster Computing

Optimization of SAMtools sorting using OpenMP tasks


Abstract

SAMtools is a widely used genomics application for post-processing high-throughput sequence alignment data. Such data are commonly sorted to make downstream analysis more efficient. However, sorting can itself be both compute- and I/O-intensive: high-throughput sequence alignment files in the de facto standard binary alignment/map (BAM) format can be many gigabytes in size, and may need to be decompressed before sorting and recompressed afterwards. As a result, BAM-file sorting can be a bottleneck in genomics workflows. This paper describes a case study on the performance analysis and optimization of SAMtools for sorting large BAM files. With 32 processor cores, OpenMP task parallelism and memory optimization techniques yielded a speedup of 5.9X over the upstream SAMtools 1.3.1 for an internal (in-memory) sort of 24.6 GiB of compressed BAM data (102.6 GiB uncompressed), and a speedup of 1.98X for an external (out-of-core) sort of a 271.4 GiB BAM file.
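To make the idea of OpenMP task parallelism in an in-memory sort concrete, the following is a minimal sketch of a task-based merge sort over plain 64-bit keys. It is not the paper's implementation and does not use any SAMtools or HTSlib code: the names `task_sort`, `sort_range`, `merge_runs`, and the `TASK_CUTOFF` threshold are hypothetical, and real BAM records would need a record comparator rather than integer keys.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold: ranges at or below this size are sorted serially. */
#define TASK_CUTOFF 4096

static int cmp_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Merge the sorted runs keys[lo, mid) and keys[mid, hi) using tmp as scratch. */
static void merge_runs(uint64_t *keys, uint64_t *tmp,
                       size_t lo, size_t mid, size_t hi) {
    assert(lo <= mid && mid <= hi);
    memcpy(tmp + lo, keys + lo, (hi - lo) * sizeof *keys);
    size_t i = lo, j = mid, k = lo;
    while (i < mid && j < hi)
        keys[k++] = (tmp[j] < tmp[i]) ? tmp[j++] : tmp[i++];
    while (i < mid) keys[k++] = tmp[i++];
    while (j < hi)  keys[k++] = tmp[j++];
}

/* Sort keys[lo, hi): each half becomes an OpenMP task, then the halves merge. */
static void sort_range(uint64_t *keys, uint64_t *tmp, size_t lo, size_t hi) {
    if (hi - lo <= TASK_CUTOFF) {
        qsort(keys + lo, hi - lo, sizeof *keys, cmp_u64);
        return;
    }
    size_t mid = lo + (hi - lo) / 2;
    /* Function parameters and locals are implicitly firstprivate in each task. */
    #pragma omp task
    sort_range(keys, tmp, lo, mid);
    #pragma omp task
    sort_range(keys, tmp, mid, hi);
    /* Wait for both child tasks before merging their results. */
    #pragma omp taskwait
    merge_runs(keys, tmp, lo, mid, hi);
}

/* Entry point (hypothetical name). Returns 0 on success, -1 if allocation fails. */
int task_sort(uint64_t *keys, size_t n) {
    if (n < 2) return 0;
    uint64_t *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) return -1;
    /* One thread creates the root of the task tree; the team executes the tasks. */
    #pragma omp parallel
    #pragma omp single
    sort_range(keys, tmp, 0, n);
    free(tmp);
    return 0;
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC), the two halves of each range can be sorted concurrently by different threads; without it the pragmas are ignored and the same code runs serially. The serial cutoff keeps task-creation overhead from dominating on small ranges, and its value here is only a placeholder that would need tuning.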
