PLoS Computational Biology

Machine Learning Meta-analysis of Large Metagenomic Datasets: Tools and Biological Insights


Abstract

Shotgun metagenomic analysis of the human-associated microbiome provides a rich set of microbial features for prediction and biomarker discovery in the context of human diseases and health conditions. However, the use of such high-resolution microbial features presents new challenges, and validated computational tools for learning tasks are lacking. Moreover, classification rules have scarcely been validated in independent studies, posing questions about the generality and generalization of disease-predictive models across cohorts. In this paper, we comprehensively assess approaches to metagenomics-based prediction tasks and for quantitative assessment of the strength of potential microbiome-phenotype associations. We develop a computational framework for prediction tasks using quantitative microbiome profiles, including species-level relative abundances and presence of strain-specific markers. A comprehensive meta-analysis, with particular emphasis on generalization across cohorts, was performed in a collection of 2424 publicly available metagenomic samples from eight large-scale studies. Cross-validation revealed good disease-prediction capabilities, which were in general improved by feature selection and use of strain-specific markers instead of species-level taxonomic abundance. In cross-study analysis, models transferred between studies were in some cases less accurate than models tested by within-study cross-validation. Interestingly, the addition of healthy (control) samples from other studies to training sets improved disease prediction capabilities. Some microbial species (most notably Streptococcus anginosus) seem to characterize general dysbiotic states of the microbiome rather than connections with a specific disease. Our results in modelling features of the “healthy” microbiome can be considered a first step toward defining general microbial dysbiosis.
The software framework, microbiome profiles, and metadata for thousands of samples are publicly available at .
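
The difference between within-study cross-validation and cross-study transfer described in the abstract can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the data are synthetic, the nearest-centroid classifier is a simple stand-in (the abstract does not name the classifiers used), and names such as `make_study`, `within_study_cv`, and `cross_study` are hypothetical. It is not the paper's framework.

```python
import random
from statistics import mean


def make_study(rng, n_samples, n_features, shift):
    """Synthetic stand-in for one cohort; label 1 = disease, 0 = control.

    `shift` mimics a study-specific offset (a hypothetical batch effect).
    """
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(shift, 1.0) for _ in range(n_features)]
        row[0] += 2.0 * label  # only the first feature carries disease signal
        X.append(row)
        y.append(label)
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid 'model': one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))


def accuracy(centroids, X, y):
    return mean([1.0 if predict(centroids, x) == t else 0.0 for x, t in zip(X, y)])


def within_study_cv(X, y, k=5):
    """k-fold cross-validation using samples from a single study only."""
    scores = []
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))
        X_tr = [x for i, x in enumerate(X) if i not in test_idx]
        y_tr = [t for i, t in enumerate(y) if i not in test_idx]
        X_te = [x for i, x in enumerate(X) if i in test_idx]
        y_te = [t for i, t in enumerate(y) if i in test_idx]
        scores.append(accuracy(fit_centroids(X_tr, y_tr), X_te, y_te))
    return mean(scores)


def cross_study(X_train, y_train, X_test, y_test):
    """Train on one study, evaluate on a different one."""
    return accuracy(fit_centroids(X_train, y_train), X_test, y_test)


rng = random.Random(0)
X_a, y_a = make_study(rng, 60, 5, shift=0.0)
X_b, y_b = make_study(rng, 60, 5, shift=1.5)

print("within-study CV (study A):", within_study_cv(X_a, y_a))
print("train on A, test on B:   ", cross_study(X_a, y_a, X_b, y_b))
```

Comparing the two printed scores for different values of `shift` gives a feel for how a study-specific offset can make a transferred model less accurate than its within-study cross-validation estimate suggests.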
