Poisson-Markov Mixture Model and Parallel Algorithm for Binning Massive and Heterogenous DNA Sequencing Reads

Abstract

A major computational challenge in analyzing metagenomics sequencing reads is identifying the unknown sources of massive and heterogeneous short DNA reads. A promising approach is to extract and exploit sequence features, i.e., k-mers, efficiently and thoroughly enough to bin the reads according to their sources. Shorter k-mers may capture base composition information, while longer k-mers may represent read abundance information. We present a novel Poisson-Markov mixture model (PMM) that systematically integrates the information in both long and short k-mers, and we develop a parallel algorithm that improves both read-binning performance and running time. We compare the performance and running time of our PMM approach with selected competing approaches on simulated data sets, and we demonstrate its utility on a time-course metagenomics data set. The probabilistic modeling framework is sufficiently flexible and general to address a wide range of supervised and unsupervised learning problems in metagenomics.
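
The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes that short-k-mer composition is scored with a first-order Markov chain, that the data-set-wide count of each long k-mer in a read is scored with a Poisson distribution, and that the two log-likelihoods are simply added before a mixture-model E-step. All function names, dictionary keys, and parameter values are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov_loglik(read, trans):
    """First-order Markov log-likelihood; trans[a][b] = P(next=b | current=a).

    The probability of the first base is ignored for simplicity.
    """
    return sum(math.log(trans[a][b]) for a, b in zip(read, read[1:]))


def poisson_logpmf(n, rate):
    """Log of the Poisson probability of observing count n at mean `rate`."""
    return n * math.log(rate) - rate - math.lgamma(n + 1)


def responsibilities(read, global_counts, k_long, bins):
    """Posterior bin probabilities for one read (a toy E-step).

    Each bin is a dict with hypothetical keys: "weight" (mixing proportion),
    "trans" (Markov transition table) and "rate" (assumed Poisson mean of
    long k-mer counts for reads from that source).
    """
    scores = []
    for b in bins:
        s = math.log(b["weight"]) + markov_loglik(read, b["trans"])
        for i in range(len(read) - k_long + 1):
            s += poisson_logpmf(global_counts[read[i:i + k_long]], b["rate"])
        scores.append(s)
    top = max(scores)  # subtract the maximum before exponentiating (log-sum-exp)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy data: one AT-rich read and one GC-rich read (made-up sequences).
reads = ["ATATTAAT", "GCGGCCGC"]
K_LONG = 4
global_counts = Counter()
for r in reads:
    global_counts.update(kmer_counts(r, K_LONG))

# Two hypothetical bins that differ only in base composition.
at_rich = {a: {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4} for a in "ACGT"}
gc_rich = {a: {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1} for a in "ACGT"}
bins = [
    {"weight": 0.5, "trans": at_rich, "rate": 1.0},
    {"weight": 0.5, "trans": gc_rich, "rate": 1.0},
]
post = responsibilities(reads[0], global_counts, K_LONG, bins)
```

Here `post` holds the posterior probability of each bin for the first read. A full EM procedure would alternate this step with re-estimating the weights, transition tables, and rates, and the per-read E-step is the part that parallelizes naturally across reads.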

Bibliographic details

Source
International Symposium on Bioinformatics Research and Applications | 2016 | 348p | 12 pages
Authors
Lu Wang; Dongxiao Zhu; Yan Li; Ming Dong
Original format: PDF
Chinese Library Classification: TP311-53
Keywords
Probabilistic clustering; Expectation-Maximization algorithm; Metagenomics; Next-generation sequencing (NGS); Parallel algorithm

Similar literature

1. Pash 3.0: A versatile software package for read mapping and integrative analysis of genomic and epigenomic variation using massively parallel DNA sequencing [J]. Cristian Coarfa, Fuli Yu, Christopher A Miller. BMC Bioinformatics, 2010, No. 1
2. Profiling the HeLa S3 transcriptome using randomly primed cDNA and massively parallel short-read sequencing [J]. Morin R, Bainbridge M, Fejes A. BioTechniques, 2008, No. 1
3. TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing [J]. M. Heath Farris, Andrew R. Scott, Pamela A. Texter. BMC Bioinformatics, 2018, No. 1
4. Poisson-Markov Mixture Model and Parallel Algorithm for Binning Massive and Heterogenous DNA Sequencing Reads [C]. Lu Wang, Dongxiao Zhu, Yan Li. International Symposium on Bioinformatics Research and Applications, 2016
5. Using Clonal Massively Parallel Sequencing to Characterize Heteroplasmy in the mtDNA of Human Head Hair, Pubic Hair, and Buccal Samples [D]. Laurie, Erin Ann. 2016
6. TIA: algorithms for development of identity-linked SNP islands for analysis by massively parallel DNA sequencing [O]. M. Heath Farris, Andrew R. Scott, Pamela A. Texter. 2018
7. Pash 3.0: A versatile software package for read mapping and integrative analysis of genomic and epigenomic variation using massively parallel DNA sequencing [O]. Chen Zuozhou, Miller Christopher A, Yu Fuli. 2010
