Journal: Microbiome

Learning Microbial Community Structures with Supervised and Unsupervised Non-negative Matrix Factorization


Abstract

Background: Learning the structure of microbial communities is critical to understanding how community structure and microbial function differ between individuals. We view a microbial community as consisting of many subcommunities, each formed by a group of microbes that are functionally dependent on each other. The focus of this paper is on methods for extracting these subcommunities from the data, in particular Non-Negative Matrix Factorization (NMF). Our methods can be applied to both OTU data and functional metagenomic data. We apply the existing unsupervised NMF method and also develop a new supervised NMF method for extracting interpretable information from classification problems.

Results: The relevance of the subcommunities identified by NMF is demonstrated by their excellent performance for classification. Through three data examples, we demonstrate how to interpret the features identified by NMF to draw meaningful biological conclusions and discover hitherto unidentified patterns in the data. In a comparison of whole metagenomes of various mammals (Muegge et al., Science 332:970–974, 2011), the biosynthesis of macrolides pathway is found in hindgut-fermenting herbivores, but not in carnivores. This is consistent with results in veterinary science that macrolides should not be given to non-ruminant herbivores. For time series microbiome data from various body sites (Caporaso et al., Genome Biol 12:50, 2011), a shift in the microbial communities is identified for one individual. The shift occurs at around the same time in the tongue and gut microbiomes, indicating that it is a genuine biological trait rather than an artefact of the method. For whole metagenome data from IBD patients and healthy controls (Qin et al., Nature 464:59–65, 2010), we identify differences in a number of pathways (some known, others new).

Conclusions: NMF is a powerful tool for identifying the key features of microbial communities. These features can be used to solve difficult classification problems with a high degree of accuracy, and they are also highly interpretable and can lead to important biological insights into the structure of the communities. In addition, NMF is a dimension-reduction method (similar to PCA): it reduces extremely complex microbial data to a low-dimensional representation, which makes a number of analyses easier to perform, for example searching for temporal patterns in the microbiome. When the goal is to find differences between the structures of two groups of communities, supervised NMF is better suited to the task, while retaining all the advantages of NMF, such as interpretability and a simple biological intuition.
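As a rough illustration of the unsupervised case described above, the sketch below factors a small non-negative abundance table X into W (microbes by subcommunities) and H (subcommunities by samples) using the standard multiplicative updates for the Frobenius objective. It is a minimal sketch and not the authors' implementation: the function name `nmf`, the toy OTU counts, the choice of two subcommunities, and the iteration count are all assumptions made for illustration. The supervised variant from the paper is not shown.

```python
import numpy as np


def nmf(X, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate a non-negative matrix X (features x samples) as W @ H.

    Uses multiplicative updates for the squared Frobenius error.
    W has shape (features, k) and H has shape (k, samples); both stay
    non-negative because each update multiplies by a non-negative ratio.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = X.shape
    W = rng.random((n_features, k)) + eps
    H = rng.random((k, n_samples)) + eps
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy table: 6 OTUs (rows) x 4 samples (columns).
# OTUs 0-2 co-occur in samples 0-1, OTUs 3-5 co-occur in samples 2-3.
X = np.array(
    [
        [5, 4, 0, 0],
        [3, 2, 0, 0],
        [4, 4, 0, 0],
        [0, 0, 6, 5],
        [0, 0, 2, 2],
        [0, 0, 3, 4],
    ],
    dtype=float,
)

W, H = nmf(X, k=2)

# Relative reconstruction error of the rank-2 non-negative approximation.
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("relative error:", rel_error)
print("W (OTU loadings per subcommunity):")
print(np.round(W, 2))
print("H (subcommunity weights per sample):")
print(np.round(H, 2))
```

Each column of W can be read as one candidate subcommunity (which microbes load on it), and each column of H gives the weight of those subcommunities in one sample, which is the low-dimensional representation the abstract refers to.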
