Conference: IEEE International Conference on Bioinformatics and Biomedicine

HPMA: High-performance metagenomic alignment tool, on a large-scale GPU cluster


Abstract

In this paper, we present HPMA, a graphics processing unit (GPU) accelerated metagenome sequence alignment algorithm for a collection of DNA sequences. The algorithm supports all-to-all pairwise local alignment on NVIDIA GPUs. HPMA builds on a GPU alignment algorithm that we developed earlier and adds a filter module, a new kernel function that we designed and developed based on the suffix array data structure. The filter module improves performance by identifying the subset of sequences that meet a user-defined similarity threshold and should therefore be considered for alignment. HPMA can balance the workload between CPU and GPU. It allows us to preprocess very large metagenomes in a reasonable amount of time, keeping pace with the increasing speed of next-generation sequencing (NGS) sequencers. The performance of HPMA has been evaluated on a cluster of Kepler-based Tesla K20 GPUs using a variety of short DNA sequence datasets. We evaluate HPMA thoroughly with four test sets. The first two test sets comprise 10 simulated datasets in which read length varies from 72 to 750 base pairs (bp). The third test set is designed to allow a comparison with published results for GSWABE, a competing GPU alignment tool. The fourth test set is an actual metagenome of over 2 million sequences with an average length of 270 bp. We used a cluster of NVIDIA K20 GPUs in the Stampede supercomputer at the Texas Advanced Computing Center (Austin, TX, USA). When running on a cluster of 10 NVIDIA K20 GPUs, HPMA is able to align 2 million simulated metagenome sequences of length 300 bp in 160 seconds. In the case of real metagenomic data, HPMA is able to align 2,038,516 sequences with an average length of 270 bp in 60 seconds.
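
The filter-then-align idea in the abstract can be illustrated with a small CPU-only sketch. An all-to-all comparison of 2,038,516 sequences involves roughly 2 × 10^12 pairs, which is why discarding dissimilar pairs before alignment matters. The abstract does not describe the GPU kernel, the similarity measure, or the scoring scheme, so everything below is an assumption made for illustration: the function names, the "share one exact seed of length seed_len" filter criterion, the naive suffix array construction, and the Smith-Waterman scores (match 2, mismatch -1, linear gap -2). It is not HPMA's implementation.

```python
from itertools import combinations


def build_suffix_array(text):
    """Naive suffix array: suffix start positions in lexicographic order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def contains(text, sa, pattern):
    """Binary-search the suffix array for an exact occurrence of pattern."""
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + len(pattern)] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sa) and text[sa[lo]:sa[lo] + len(pattern)] == pattern


def passes_filter(a, b, sa_b, seed_len):
    """Hypothetical criterion: a and b share an exact substring of seed_len."""
    return any(contains(b, sa_b, a[i:i + seed_len])
               for i in range(len(a) - seed_len + 1))


def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (linear gap penalty, assumed scores)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def align_all_pairs(reads, seed_len=4):
    """Align only the pairs that survive the suffix-array filter."""
    suffix_arrays = [build_suffix_array(r) for r in reads]
    results = {}
    for i, j in combinations(range(len(reads)), 2):
        if passes_filter(reads[i], reads[j], suffix_arrays[j], seed_len):
            results[(i, j)] = smith_waterman_score(reads[i], reads[j])
    return results


if __name__ == "__main__":
    reads = ["ACGTACGT", "TTACGTAA", "GGGGCCCC"]
    print(align_all_pairs(reads))
```

With the three toy reads above, only the first two share a 4-base seed, so only that pair reaches the alignment step. In a GPU setting each surviving pair would typically be handed to a separate thread or thread block, but how HPMA distributes that work is not stated in the abstract.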
