
GPU Acceleration of Pyrosequencing Noise Removal



Abstract

AmpliconNoise [1], an updated version of PyroNoise [2], is a tool for removing noise from metagenomic data recorded by a 454 pyrosequencer. AmpliconNoise has been shown to be effective in reducing overestimation of operational taxonomic units (OTUs) and in chimera detection. AmpliconNoise's noise removal method relies on clustering a large set of short sequences read by the sequencer. The DNA sequencing algorithm requires the computation of O(n^2) pairwise distances using a global sequence alignment method. Each sequence consists of a few hundred base pairs and a typical dataset contains 10^4 sequences, making the clustering computation extremely expensive. In this paper we describe a GPU kernel implementation of the most computationally expensive module in the AmpliconNoise software package, SeqDist. With our GPU workstation (Intel Core i7 980 @ 3.33 GHz + 3 x NVIDIA Tesla C2070) and a typical 454 dataset, our implementation (CUDA-SeqDist) achieves an 8.6X speedup with a single GPU when compared with 12 MPI ranks of the original tools running on the CPU alone. With three GPUs, we achieve a 2.1X further speedup over the single-GPU version, yielding a total speedup of 18.3X. We measure the throughput of our kernel to be 1.4 giga floating-point cell updates per second (GFCUPS) with a single GPU and 2.9 GFCUPS with 3 GPUs, where GFCUPS refers to the unique method by which the score matrix must be updated in the specialized alignment algorithm used in AmpliconNoise.
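To make the cost argument concrete, the following is a minimal CPU sketch of an all-pairs distance computation built on global alignment. It is not the SeqDist algorithm or the CUDA kernel described in the abstract: it assumes a plain Needleman-Wunsch recurrence with unit mismatch and gap costs, and the names `global_alignment_cost`, `pairwise_distances`, and `pair_count` are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def global_alignment_cost(a, b, mismatch=1.0, gap=1.0):
    """Minimum cost of globally aligning a and b (Needleman-Wunsch style).

    Assumption: unit mismatch and gap costs. The specialized scoring used
    by AmpliconNoise is not reproduced here.
    """
    # Row 0 of the score matrix: align a prefix of b against nothing.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0.0] * len(b)
        for j in range(1, len(b) + 1):
            # One "cell update": the best of substitution, deletion, insertion.
            sub = prev[j - 1] + (0.0 if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = min(sub, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return prev[len(b)]


def pair_count(n):
    """Number of unordered pairs among n sequences: n(n-1)/2."""
    return n * (n - 1) // 2


def pairwise_distances(seqs):
    """Alignment cost for every unordered pair of sequences, keyed by index."""
    return {
        (i, j): global_alignment_cost(seqs[i], seqs[j])
        for i, j in combinations(range(len(seqs)), 2)
    }


if __name__ == "__main__":
    reads = ["ACGT", "AGT", "ACCT", "ACGT"]
    print(pairwise_distances(reads))
    print(pair_count(10**4))
```

Each pair fills a score matrix whose size is the product of the two sequence lengths, and with 10^4 sequences there are 49,995,000 unordered pairs, which is why this step dominates the running time and is a natural target for GPU parallelism.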
