Journal: Bioinformatics

A robust and accurate binning algorithm for metagenomic sequences with arbitrary species abundance ratio


Abstract

Motivation: With the rapid development of next-generation sequencing techniques, metagenomics, also known as environmental genomics, has emerged as an exciting research area that enables us to analyze the microbial environment in which we live. An important step for metagenomic data analysis is the identification and taxonomic characterization of DNA fragments (reads or contigs) resulting from sequencing a sample of mixed species. This step is referred to as 'binning'. Binning algorithms that are based on sequence similarity and sequence composition markers rely heavily on the reference genomes of known microorganisms or phylogenetic markers. Due to the limited availability of reference genomes and the bias and low availability of markers, these algorithms may not be applicable in all cases. Unsupervised binning algorithms which can handle fragments from unknown species provide an alternative approach. However, existing unsupervised binning algorithms only work on datasets either with balanced species abundance ratios or rather different abundance ratios, but not both.

Results: In this article, we present MetaCluster 3.0, an integrated binning method based on the unsupervised top-down separation and bottom-up merging strategy, which can bin metagenomic fragments of species with very balanced abundance ratios (say 1:1) to very different abundance ratios (e.g. 1:24) with consistently higher accuracy than existing methods.
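To make the idea of unsupervised "top-down separation and bottom-up merging" more concrete, here is a minimal illustrative sketch. It is not MetaCluster 3.0's actual algorithm, which the abstract does not spell out. Everything in it is an assumption made for illustration: the use of k-mer frequency profiles as the composition feature, k-means as the splitting step, Euclidean distance between centroids as the merging criterion, and all function names, parameter values (`k`, `n_groups`, `threshold`), and the synthetic fragments.

```python
import itertools
import math
import random
from collections import Counter


def kmer_profile(seq, k=4):
    """Normalized k-mer frequency vector over the ACGT alphabet."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers) or 1  # avoid division by zero
    return [counts[m] / total for m in kmers]


def distance(a, b):
    """Euclidean distance between two profiles (an assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Component-wise mean of a non-empty list of profiles."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def split_top_down(profiles, n_groups, n_iter=20):
    """Top-down step: k-means into deliberately many small groups."""
    # Farthest-point initialization keeps this sketch deterministic.
    centers = [profiles[0]]
    while len(centers) < n_groups:
        centers.append(max(
            profiles, key=lambda p: min(distance(p, c) for c in centers)))
    labels = [0] * len(profiles)
    for _ in range(n_iter):
        labels = [min(range(n_groups), key=lambda j: distance(p, centers[j]))
                  for p in profiles]
        for j in range(n_groups):
            members = [p for p, lab in zip(profiles, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return labels


def merge_bottom_up(profiles, labels, threshold):
    """Bottom-up step: merge the two closest groups until none are close."""
    groups = {lab: [i for i, x in enumerate(labels) if x == lab]
              for lab in set(labels)}
    while len(groups) > 1:
        cents = {lab: centroid([profiles[i] for i in idx])
                 for lab, idx in groups.items()}
        a, b = min(itertools.combinations(sorted(groups), 2),
                   key=lambda ab: distance(cents[ab[0]], cents[ab[1]]))
        if distance(cents[a], cents[b]) > threshold:
            break
        groups[a].extend(groups.pop(b))
    final = [0] * len(profiles)
    for new_label, idx in enumerate(groups.values()):
        for i in idx:
            final[i] = new_label
    return final


def random_fragment(rng, weights, length=1000):
    """Hypothetical test data: i.i.d. bases with the given A, C, G, T weights."""
    return "".join(rng.choices("ACGT", weights=weights, k=length))


# Two synthetic "species" with unequal abundance (20 vs 5 fragments).
rng = random.Random(0)
at_rich = [0.35, 0.15, 0.15, 0.35]  # weights for A, C, G, T
gc_rich = [0.15, 0.35, 0.35, 0.15]
fragments = [random_fragment(rng, at_rich) for _ in range(20)]
fragments += [random_fragment(rng, gc_rich) for _ in range(5)]

profiles = [kmer_profile(f, k=2) for f in fragments]
over_split = split_top_down(profiles, n_groups=4)
bins = merge_bottom_up(profiles, over_split, threshold=0.15)
print(len(set(bins)), "bins")
```

The point of over-splitting first is that a low-abundance group can get a cluster of its own before any merging happens, rather than being absorbed into a dominant one. Whether this resembles the published method in detail cannot be determined from the abstract alone.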
