Published in: IEEE International Congress on Big Data

Using machine learning to identify major shifts in human gut microbiome protein family abundance in disease

Abstract

Inflammatory Bowel Disease (IBD) is an autoimmune condition associated with major alterations in the taxonomic composition of the gut microbiome. Here we classify major changes in microbiome protein family abundances between healthy subjects and IBD patients. We use machine learning to analyze previously computed relative abundances of ~10,000 KEGG orthologous protein families in the gut microbiomes of a set of healthy individuals and IBD patients. We develop a machine learning pipeline, involving the Kolmogorov-Smirnov test, to identify the 100 most statistically significant entries in the KEGG database. We then use these 100 as a training set for a Random Forest classifier to determine the ~5% of KEGGs that best separate disease and healthy states. Lastly, we develop a Natural Language Processing classifier of the KEGG description files to predict whether a KEGG is relatively over- or under-abundant. As we expand our analysis from 10,000 KEGG protein families to one million proteins identified in the gut microbiome, scalable methods for quickly identifying such anomalies between health and disease states will be increasingly valuable for biological interpretation of sequence data.
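The screening step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the family identifiers and the names `ks_statistic` and `top_k_by_ks` are hypothetical, and the two-sample Kolmogorov-Smirnov statistic is computed by hand so the example needs only the Python standard library. It assumes every family is measured in the same subjects, so ranking by the statistic gives the same order as ranking by p-value.

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both CDFs past every value equal to x (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def top_k_by_ks(healthy, disease, k):
    """Return the k families whose abundance distributions differ most between groups."""
    scores = {fam: ks_statistic(healthy[fam], disease[fam]) for fam in healthy}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic stand-in data: 50 hypothetical families, 20 subjects per group.
rng = random.Random(0)
families = [f"FAM{n:05d}" for n in range(50)]
shifted = set(families[:3])  # families given an artificial abundance shift in disease

healthy = {f: [rng.gauss(10, 1) for _ in range(20)] for f in families}
disease = {
    f: [rng.gauss(15 if f in shifted else 10, 1) for _ in range(20)]
    for f in families
}

top = top_k_by_ks(healthy, disease, k=3)
print(top)
```

In practice, `scipy.stats.ks_2samp` returns both the statistic and a p-value, and the screened families could then be passed to a Random Forest such as scikit-learn's `RandomForestClassifier`, whose `feature_importances_` attribute gives one way to rank them. Whether the authors used these libraries is not stated in the abstract.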
