Journal: PLoS One

Exploring thematic structure and predicted functionality of 16S rRNA amplicon data



Abstract

Analysis of microbiome data involves identifying co-occurring groups of taxa associated with sample features of interest (e.g., disease state). Elucidating such relations is often difficult as microbiome data are compositional, sparse, and have high dimensionality. Also, the configuration of co-occurring taxa may represent overlapping subcommunities that contribute to sample characteristics such as host status. Preserving the configuration of co-occurring microbes rather than detecting specific indicator species is more likely to facilitate biologically meaningful interpretations. Additionally, analyses that use taxonomic relative abundances to predict the abundances of different gene functions aggregate predicted functional profiles across taxa. This precludes straightforward identification of predicted functional components associated with subsets of co-occurring taxa. We provide an approach to explore co-occurring taxa using “topics” generated via a topic model and link these topics to specific sample features (e.g., disease state). Rather than inferring predicted functional content based on overall taxonomic relative abundances, we instead focus on inference of functional content within topics, which we parse by estimating interactions between topics and pathways through a multilevel, fully Bayesian regression model. We apply our methods to three publicly available 16S amplicon sequencing datasets: an inflammatory bowel disease dataset, an oral cancer dataset, and a time-series dataset. Using our topic model approach to uncover latent structure in 16S rRNA amplicon surveys, investigators can (1) capture groups of co-occurring taxa termed topics; (2) uncover within-topic functional potential; (3) link taxa co-occurrence, gene function, and environmental/host features; and (4) explore the way in which sets of co-occurring taxa behave and evolve over time. 
These methods have been implemented in a freely available R package: https://cran.r-project.org/package=themetagenomics, https://github.com/EESI/themetagenomics.
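
To make the idea of "topics" as groups of co-occurring taxa more concrete, the sketch below fits a plain latent Dirichlet allocation (LDA) model with a collapsed Gibbs sampler to a tiny, invented samples-by-taxa count table, then averages the per-sample topic proportions within each sample group. It is an illustration only, written in Python rather than R: it does not use the themetagenomics package, plain LDA may differ from the topic model the authors use, and it omits the functional prediction and Bayesian regression steps. The function `fit_lda`, the count table, and the `control`/`disease` labels are all hypothetical.

```python
import random


def fit_lda(counts, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for LDA on a samples-by-taxa count table.

    Returns (theta, phi): per-sample topic proportions and per-topic
    taxon distributions. alpha and beta are symmetric Dirichlet priors.
    """
    rng = random.Random(seed)
    n_samples, n_taxa = len(counts), len(counts[0])

    # Expand the table into one token per read: (sample index, taxon index).
    tokens = [(s, t)
              for s in range(n_samples)
              for t in range(n_taxa)
              for _ in range(counts[s][t])]

    # Random initial topic assignment for every token, plus count tables.
    z = [rng.randrange(n_topics) for _ in tokens]
    sample_topic = [[0] * n_topics for _ in range(n_samples)]
    topic_taxon = [[0] * n_taxa for _ in range(n_topics)]
    topic_total = [0] * n_topics
    for (s, t), k in zip(tokens, z):
        sample_topic[s][k] += 1
        topic_taxon[k][t] += 1
        topic_total[k] += 1

    for _ in range(n_iter):
        for i, (s, t) in enumerate(tokens):
            # Remove the token's current assignment from the counts.
            k = z[i]
            sample_topic[s][k] -= 1
            topic_taxon[k][t] -= 1
            topic_total[k] -= 1

            # Conditional probability of each topic for this token.
            weights = [(sample_topic[s][j] + alpha)
                       * (topic_taxon[j][t] + beta)
                       / (topic_total[j] + n_taxa * beta)
                       for j in range(n_topics)]
            k = rng.choices(range(n_topics), weights=weights)[0]

            # Record the newly sampled assignment.
            z[i] = k
            sample_topic[s][k] += 1
            topic_taxon[k][t] += 1
            topic_total[k] += 1

    # Smoothed point estimates from the final state of the sampler.
    theta = [[(c + alpha) / (sum(row) + n_topics * alpha) for c in row]
             for row in sample_topic]
    phi = [[(c + beta) / (topic_total[k] + n_taxa * beta)
            for c in topic_taxon[k]]
           for k in range(n_topics)]
    return theta, phi


# Invented toy data: 4 samples x 6 taxa, with hypothetical group labels.
counts = [
    [20, 15, 10, 0, 1, 0],
    [18, 12, 14, 1, 0, 0],
    [0, 1, 0, 16, 20, 11],
    [1, 0, 0, 14, 17, 13],
]
labels = ["control", "control", "disease", "disease"]

theta, phi = fit_lda(counts, n_topics=2)

# Link topics to a sample feature: mean topic proportion per group.
for group in sorted(set(labels)):
    rows = [theta[i] for i, g in enumerate(labels) if g == group]
    mean = [sum(col) / len(rows) for col in zip(*rows)]
    print(group, [round(m, 2) for m in mean])
```

Each row of `phi` is a distribution over taxa, so its highest-weight entries can be read as one candidate set of co-occurring taxa. Comparing group means of `theta` is a crude stand-in for the topic-to-feature link described in the abstract, not an estimate of the authors' model.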
