Conference: International Conference on Pattern Recognition in Bioinformatics

CRiSPy-CUDA: Computing Species Richness in 16S rRNA Pyrosequencing Datasets with CUDA


Abstract

Pyrosequencing technologies are frequently used to sequence the 16S rRNA marker gene in metagenomic studies of microbial communities. Computing a pairwise genetic distance matrix from the resulting reads is an important but highly time-consuming task. In this paper, we present a parallelized tool (called CRiSPy) for scalable pairwise genetic distance matrix computation and clustering that is based on the processing pipeline of the popular ESPRIT software package. To achieve high computational efficiency, we have designed massively parallel CUDA algorithms for pairwise k-mer distance and pairwise genetic distance computation. We have also implemented a memory-efficient sparse matrix clustering program to process the distance matrix. On a single GPU, CRiSPy achieves speedups of around two orders of magnitude over the sequential ESPRIT program, both for the time-consuming pairwise genetic distance module and for the whole processing pipeline, making CRiSPy particularly suitable for high-throughput microbial studies.
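The abstract names pairwise k-mer distance as one of the parallelized steps but does not define it. As a rough illustration only, the sketch below assumes one common definition: one minus the number of k-mers two reads share, divided by the number of k-mers in the shorter read. It is plain sequential Python, not the CUDA kernels the paper describes, and the function names `kmer_counts`, `kmer_distance`, and `pairwise_kmer_distances` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a, b, k):
    """Assumed definition: 1 - shared k-mers / k-mers in the shorter read."""
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    # For each k-mer, the shared count is the smaller of the two occurrence counts.
    shared = sum(min(n, counts_b[kmer]) for kmer, n in counts_a.items())
    denom = min(len(a), len(b)) - k + 1
    if denom <= 0:
        raise ValueError("both reads must contain at least k characters")
    return 1.0 - shared / denom


def pairwise_kmer_distances(reads, k):
    """Distance for every unordered pair (i, j) with i < j."""
    return {
        (i, j): kmer_distance(reads[i], reads[j], k)
        for i, j in combinations(range(len(reads)), 2)
    }


if __name__ == "__main__":
    reads = ["ACGTACGT", "ACGTACGA", "TTTTGGGG"]
    for pair, dist in pairwise_kmer_distances(reads, k=3).items():
        print(pair, round(dist, 3))
```

Every pair is independent of every other pair, which is the property that makes this step a natural fit for massively parallel hardware: the n(n-1)/2 distance computations can be distributed across threads without coordination.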