2014 3rd International Conference on Parallel Distributed and Grid Computing

Performance improvement of BWA MEM algorithm using data-parallel with concurrent parallelization

Abstract

The Burrows-Wheeler Transform (BWT) is a widely used data compression technique in next-generation sequencing (NGS) analysis. With advances in NGS technology, genome data sizes have grown rapidly, and these larger volumes of genome data need to be processed in parallel. Generally, NGS data is processed by traditional parallel processing approaches such as (i) thread parallelization, (ii) data parallelization, and (iii) concurrent parallelization, which have their own performance bottlenecks in thread scalability, scattering/gathering of data, and memory bandwidth limitations, respectively. To eliminate these drawbacks, we introduce a hybrid parallelization approach called "data-parallel with concurrent parallelization" for genome alignment. We use the BWA MEM algorithm to align human genome sequences; the algorithm is dominated by memory-intensive operations, and its performance is limited by cache/TLB misses. To eliminate these cache/TLB misses, the genome data is partitioned into multiple pieces (i.e., reducing the read size) using data parallelization, and these pieces are processed concurrently within the same cache/memory hierarchy. As a result, the proposed data-parallel with concurrent parallelization performs 45% better than the traditional parallelization approaches. Additionally, we provide a proof of concept for processing larger volumes of genome data using the BWA MEM algorithm on high-end desktop machines.
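
The partition-then-process-concurrently idea described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `partition`, `align_chunk`, and `align_partitioned` are hypothetical names, and `align_chunk` is a toy stand-in that only looks up exact matches with `str.find` instead of running BWA MEM. The parameters `n_pieces` and `n_workers` are likewise assumptions chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(reads, n_pieces):
    """Scatter step: split the reads into at most n_pieces contiguous chunks."""
    size = max(1, -(-len(reads) // n_pieces))  # ceiling division, at least 1
    return [reads[i:i + size] for i in range(0, len(reads), size)]


def align_chunk(reference, chunk):
    """Hypothetical stand-in for an aligner: exact-match offset, or -1 if absent."""
    return [reference.find(read) for read in chunk]


def align_partitioned(reference, reads, n_pieces=4, n_workers=2):
    """Process the smaller chunks concurrently, then gather results in order."""
    chunks = partition(reads, n_pieces)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Executor.map preserves input order, so gathering is a plain concatenation.
        per_chunk = pool.map(lambda chunk: align_chunk(reference, chunk), chunks)
        return [pos for chunk_result in per_chunk for pos in chunk_result]


if __name__ == "__main__":
    reads = ["ACGT", "TAGC", "GGGG", "CGTT", "GTAC"]
    print(align_partitioned("ACGTACGTTAGC", reads, n_pieces=2))
```

Threads are used here only to keep the sketch short and self-contained. Under the assumption that each piece would be handed to a real aligner, the workers would more plausibly be separate processes sharing one machine's cache/memory hierarchy, with piece size tuned to limit cache/TLB misses.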
