...
Source: Scientific Reports

Establishment and evaluation of prediction model for multiple disease classification based on gut microbial data



Abstract

Disease prediction has been performed using machine learning approaches on various types of biological data. One representative data type is the gut microbial community, which interacts with the host's immune system. The abundances of a few microorganisms have been used as markers to predict diverse diseases. In this study, we hypothesized that multi-class classification using a machine learning approach could distinguish the gut microbiomes associated with the following six diseases: multiple sclerosis, juvenile idiopathic arthritis, myalgic encephalomyelitis/chronic fatigue syndrome, acquired immune deficiency syndrome, stroke and colorectal cancer. We used the abundance of microorganisms at five taxonomic levels as features in 696 samples collected from different studies to establish the best prediction model. We built classification models based on four multi-class classifiers and two feature selection methods, forward selection and backward elimination. We found that classification performance improved as features from lower taxonomic levels were used; the highest performance was observed at the genus level. Among the four classifiers, the LogitBoost-based prediction model outperformed the others. We also propose the optimal genus-level feature subsets obtained by backward elimination. We believe the selected feature subsets could be used as markers to distinguish various diseases simultaneously. The findings of this study suggest the potential use of the selected features for the diagnosis of several diseases.
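The backward elimination mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, a simple nearest-centroid classifier stands in for LogitBoost and the other classifiers, and every function name (`make_data`, `cv_accuracy`, `backward_elimination`) is hypothetical. The loop starts from all features, repeatedly drops the single feature whose removal gives the best cross-validated accuracy, and remembers the best subset seen along the way.

```python
import random

def make_data(n_per_class=40, n_classes=3, n_noise=5, seed=0):
    """Synthetic stand-in for an abundance table (assumption, not real data).
    Feature j is shifted upward for class j; the remaining features are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        for _ in range(n_per_class):
            informative = [rng.gauss(3.0 if j == label else 0.0, 1.0)
                           for j in range(n_classes)]
            noise = [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(informative + noise)
            y.append(label)
    return X, y

def fit_centroids(X, y, features):
    """Per-class mean of the selected features (nearest-centroid 'training')."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, f in enumerate(features):
            acc[k] += row[f]
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict(centroids, row, features):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def dist(c):
        return sum((row[f] - m) ** 2 for f, m in zip(features, centroids[c]))
    return min(centroids, key=dist)

def cv_accuracy(X, y, features, k=5, seed=0):
    """k-fold cross-validated accuracy using only the given feature indices."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in range(k):
        test = set(idx[fold::k])
        train_X = [X[i] for i in idx if i not in test]
        train_y = [y[i] for i in idx if i not in test]
        centroids = fit_centroids(train_X, train_y, features)
        correct += sum(predict(centroids, X[i], features) == y[i] for i in test)
    return correct / len(X)

def backward_elimination(X, y, min_features=1):
    """Greedy backward elimination: drop one feature per round, track the best subset."""
    features = list(range(len(X[0])))
    best_score = cv_accuracy(X, y, features)
    best_subset = list(features)
    while len(features) > min_features:
        # Score every single-feature removal and keep the best-scoring one.
        score, drop = max(
            (cv_accuracy(X, y, [f for f in features if f != d]), d)
            for d in features
        )
        features.remove(drop)
        # On ties, prefer the smaller subset.
        if score >= best_score:
            best_score, best_subset = score, list(features)
    return best_subset, best_score

X, y = make_data()
subset, score = backward_elimination(X, y)
print("selected feature indices:", subset)
print("cross-validated accuracy:", round(score, 3))
```

Forward selection is the mirror image: start from an empty set and add the feature that most improves the cross-validated score at each round. In practice the classifier and the scoring metric would be swapped for whichever ones the study at hand uses.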
