Journal: Nucleic Acids Research

Unsupervised discovery of microbial population structure within metagenomes using nucleotide base composition

Abstract

One approach to inferring the unknown microbial population structure within a metagenome is to cluster nucleotide sequences based on common patterns in base composition, a process otherwise referred to as binning. When functional roles are assigned to the identified populations, a deeper understanding of microbial communities can be attained, more so than with gene-centric approaches that explore overall functionality. In this study, we propose an unsupervised, model-based binning method with two clustering tiers. The first tier uses a novel transformation of the oligonucleotide frequency-derived error gradient and GC content to generate coarse groups, and the second tier uses tetranucleotide frequency to refine these groups. The proposed method demonstrated an improvement over PhyloPythia, S-GSOM, TACOA and TaxSOM on all three benchmarks that were used for evaluation in this study. The proposed method was then applied to a pyrosequenced metagenomic library of mud volcano sediment sampled in southwestern Taiwan, with the inferred population structure validated against complementary sequencing of 16S ribosomal RNA marker genes. Finally, the proposed method was further validated against four publicly available metagenomes, including a highly complex Antarctic whale-fall bone sample, which was previously assumed to be too complex for binning prior to functional analysis.
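The abstract names two base-composition features, GC content and tetranucleotide frequency, without giving implementation details. The Python sketch below only illustrates how such features might be computed from a nucleotide sequence. It is not the authors' method: the error-gradient transformation and the model-based clustering are not specified here and are omitted. The function names, the fixed-width GC grouping in `coarse_groups`, and the `n_bins` parameter are hypothetical stand-ins chosen for illustration.

```python
from __future__ import annotations

from collections import Counter
from itertools import product

BASES = "ACGT"
# All 256 possible tetranucleotides, in a fixed lexicographic order.
TETRAMERS = ["".join(p) for p in product(BASES, repeat=4)]


def gc_content(seq: str) -> float:
    """Fraction of unambiguous bases (A, C, G, T) that are G or C."""
    counted = [b for b in seq.upper() if b in BASES]
    if not counted:
        return 0.0
    return sum(b in "GC" for b in counted) / len(counted)


def tetranucleotide_frequencies(seq: str) -> list[float]:
    """Relative frequency of each tetramer over overlapping 4-base windows.

    Windows containing ambiguous characters (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    valid = [counts.get(t, 0) for t in TETRAMERS]
    total = sum(valid)
    if total == 0:
        return [0.0] * len(TETRAMERS)
    return [c / total for c in valid]


def coarse_groups(seqs: dict[str, str], n_bins: int = 10) -> dict[int, list[str]]:
    """Hypothetical stand-in for a first clustering tier.

    Groups sequence IDs into fixed-width GC-content intervals. The paper
    describes a model-based method instead; this is only a placeholder.
    """
    groups: dict[int, list[str]] = {}
    for name, seq in seqs.items():
        idx = min(int(gc_content(seq) * n_bins), n_bins - 1)
        groups.setdefault(idx, []).append(name)
    return groups
```

Under these assumptions, a second tier would take the tetranucleotide frequency vectors of the sequences within each coarse group and cluster them further, with the choice of clustering model left open.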