Conference: International Symposium on Bioinformatics Research and Applications

Poisson-Markov Mixture Model and Parallel Algorithm for Binning Massive and Heterogenous DNA Sequencing Reads



Abstract

A major computational challenge in analyzing metagenomics sequencing reads is to identify the unknown sources of massive and heterogeneous short DNA reads. A promising approach is to efficiently and sufficiently extract and exploit sequence features, i.e., k-mers, to bin the reads according to their sources. Shorter k-mers may capture base composition information, while longer k-mers may represent read abundance information. We present a novel Poisson-Markov mixture model (PMM) that systematically integrates the information in both long and short k-mers, and we develop a parallel algorithm that improves both read-binning performance and running time. We compare the performance and running time of our PMM approach with selected competing approaches using simulated data sets, and we also demonstrate the utility of our PMM approach using a time-course metagenomics data set. The probabilistic modeling framework is sufficiently flexible and general to solve a wide range of supervised and unsupervised learning problems in metagenomics.
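As a rough illustration of the two kinds of k-mer features the abstract mentions, the sketch below computes a short-k-mer composition vector and a long-k-mer abundance proxy for each read. It is not the authors' PMM or their parallel algorithm; the function names (`kmer_counts`, `composition_vector`, `abundance_feature`), the choices k=2 and k=6, and the toy reads are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_counts(read, k):
    """Count every overlapping k-mer in one read, skipping k-mers with non-ACGT symbols."""
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def composition_vector(read, k=2):
    """Normalized short-k-mer frequencies (hypothetical base-composition feature)."""
    counts = kmer_counts(read, k)
    total = sum(counts.values())
    alphabet = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[a] / total if total else 0.0 for a in alphabet]


def abundance_feature(read, global_counts, k):
    """Mean data-set-wide count of the read's long k-mers (hypothetical abundance proxy)."""
    own = kmer_counts(read, k)
    n = sum(own.values())
    if n == 0:
        return 0.0
    return sum(global_counts[kmer] * c for kmer, c in own.items()) / n


# Toy data: two identical reads from an "abundant" source, one read from a "rare" source.
reads = ["ACGTACGTAC", "ACGTACGTAC", "GGGGCCCCGG"]
LONG_K = 6  # assumed value, chosen only for this example

# Pool long k-mer counts over the whole data set.
global_counts = Counter()
for r in reads:
    global_counts.update(kmer_counts(r, LONG_K))

for r in reads:
    comp = composition_vector(r, k=2)
    abund = abundance_feature(r, global_counts, LONG_K)
    print(r, round(abund, 2), [round(x, 2) for x in comp[:4]])
```

In this toy data set the duplicated reads receive a higher abundance value than the singleton read, while the composition vectors separate the AT-mixed reads from the GC-rich one. A mixture model such as the one described in the abstract would presumably combine signals of both kinds, but how it does so is not specified here.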
