Poisson-Markov Mixture Model and Parallel Algorithm for Binning Massive and Heterogenous DNA Sequencing Reads

Abstract

A major computational challenge in analyzing metagenomics sequencing reads is to identify the unknown sources of massive and heterogeneous short DNA reads. A promising approach is to efficiently and sufficiently extract and exploit sequence features, i.e., k-mers, to bin the reads according to their sources. Shorter k-mers may capture base composition information, while longer k-mers may represent read abundance information. We present a novel Poisson-Markov Mixture Model (PMM) that systematically integrates the information in both long and short k-mers, and we develop a parallel algorithm that improves both read-binning performance and running time. We compare the performance and running time of our PMM approach with selected competing approaches using simulated data sets, and we also demonstrate the utility of our PMM approach using a time-course metagenomics data set. The probabilistic modeling framework is sufficiently flexible and general to solve a wide range of supervised and unsupervised learning problems in metagenomics.
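The k-mer features mentioned in the abstract can be illustrated with a minimal Python sketch. It only shows how overlapping k-mers might be counted per read (short k, as a composition profile) and pooled across reads (long k, as a rough abundance signal). It is not the authors' PMM or their parallel algorithm; the helper name `kmer_counts`, the choices k = 2 and k = 4, and the toy reads are assumptions made for illustration.

```python
from collections import Counter


def kmer_counts(read: str, k: int) -> Counter:
    """Count every overlapping k-mer in one read (hypothetical helper)."""
    read = read.upper()
    # A read shorter than k yields an empty Counter.
    return Counter(read[i:i + k] for i in range(len(read) - k + 1))


# Toy reads, invented for illustration only.
reads = ["ACGTACGT", "TTGACGTA"]

# Short k-mers: one composition profile per read.
short_profiles = [kmer_counts(r, 2) for r in reads]

# Long k-mers: counts pooled over all reads, a crude abundance signal.
long_pooled = sum((kmer_counts(r, 4) for r in reads), Counter())

print(short_profiles[0])
print(long_pooled["ACGT"])
```

In a real binning pipeline these counts would feed a statistical model; here they are simply printed to show the two kinds of feature side by side.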

Bibliographic details

Source
International Symposium on Bioinformatics Research and Applications | 2016 | pp. 15-26 | 12 pages
Authors
Lu Wang; Dongxiao Zhu; Yan Li; Ming Dong
Original format: PDF
Keywords
Probabilistic clustering; Expectation-Maximization algorithm; Metagenomics; Next-generation sequencing (NGS); Parallel algorithm
