2012 IEEE 6th International Conference on Systems Biology

A machine learning framework of functional biomarker discovery for different microbial communities based on metagenomic data

Abstract

As more than 90% of a microbial community cannot be isolated and cultivated, metagenomic methods are commonly used to analyze the microbial community as a whole. With the fast accumulation of metagenomic samples, it is now of interest to find simple biomarkers, especially functional biomarkers, that can distinguish different metagenomic samples. Next-generation sequencing techniques have enabled the detection of very accurate gene-presence (abundance) values in metagenomic studies, and the presence/absence or differing abundance values of a set of genes could serve as a biomarker for identifying the corresponding microbial community's phenotype. However, it is not yet clear how to select such a set of genes (features), or how accurately a selected set of genes predicts a microbial community's phenotype. In this study, we evaluated different machine learning methods, including feature selection methods and classification methods, for selecting biomarkers that can distinguish different samples. We then propose a machine learning framework that discovers biomarkers for different microbial communities by mining metagenomic data. Given a set of features (genes) and their presence values in multiple samples, we first select discriminative features as candidates by feature selection, and then use a classification method to select the feature sets with low error rates and high classification accuracies as biomarkers. We selected whole genome sequencing data from simulation, the public domain and in-house metagenomic data generation facilities, and tested the framework on prediction and evaluation of the biomarkers. The results show that the framework can select functional biomarkers with very high accuracy. Therefore, this framework would be a suitable tool for discovering functional biomarkers that distinguish different microbial communities.
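
The two-stage procedure described in the abstract (feature selection to propose candidates, then classification to keep the feature set with the lowest error) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the scoring rule is a simple difference of class means standing in for ReliefF or mRMR, the classifier is a nearest-centroid rule, the error estimate is leave-one-out, and the toy abundance matrix and the labels "gut" and "soil" are made up.

```python
from statistics import mean

def score_features(X, y):
    """Filter step: score each feature by the absolute difference of class means.
    (A stand-in for ReliefF/mRMR; assumes exactly two classes.)"""
    classes = sorted(set(y))
    assert len(classes) == 2, "this sketch handles two classes only"
    scores = []
    for j in range(len(X[0])):
        a = mean(row[j] for row, label in zip(X, y) if label == classes[0])
        b = mean(row[j] for row, label in zip(X, y) if label == classes[1])
        scores.append(abs(a - b))
    return scores

def nearest_centroid_predict(X_train, y_train, x, features):
    """Predict the label whose class centroid (on the chosen features) is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(y_train)):
        rows = [r for r, l in zip(X_train, y_train) if l == label]
        centroid = [mean(r[j] for r in rows) for j in features]
        dist = sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_error(X, y, features):
    """Leave-one-out error rate of the nearest-centroid rule on a feature subset."""
    errors = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        if nearest_centroid_predict(X_train, y_train, X[i], features) != y[i]:
            errors += 1
    return errors / len(X)

def select_biomarkers(X, y, max_k=3):
    """Rank features, then keep the top-k prefix with the lowest leave-one-out error."""
    scores = score_features(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
    best_set, best_err = None, float("inf")
    for k in range(1, min(max_k, len(ranked)) + 1):
        candidate = ranked[:k]
        err = loo_error(X, y, candidate)
        if err < best_err:  # strict: prefer the smaller set on ties
            best_set, best_err = candidate, err
    return best_set, best_err

# Made-up gene abundance values: rows are samples, columns are genes.
X = [
    [5.0, 1.0, 0.0],
    [6.0, 0.0, 1.0],
    [5.5, 1.0, 1.0],
    [0.5, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
y = ["gut", "gut", "gut", "soil", "soil", "soil"]

best_set, best_err = select_biomarkers(X, y)
print(best_set, best_err)
```

In this sketch the features are scored once on all samples, which makes the leave-one-out error optimistic; a more careful evaluation would repeat the scoring inside each leave-one-out fold.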

Bibliographic record

  • Source
  • Conference location: Xi'an (CN)
  • Author affiliations

    Investigation Group of Molecular Virology, Immunology, Oncology Systems Biology, Center for Bioinformatics, College of Life Science, and Research Laboratory of Virology, Immunology Bioinformatics, Department of Preventive Veterinary Medicine, College of Veterinary Medicine, Northwest A&F University, Yangling 712100, Xi’an City, Shaanxi, P.R. China;

    Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong, P.R. China;

    Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong, P.R. China;

    Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong, P.R. China;

    Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong, P.R. China;

    Investigation Group of Molecular Virology, Immunology, Oncology Systems Biology, Center for Bioinformatics, College of Life Science, and Research Laboratory of Virology, Immunology Bioinformatics, Department of Preventive Veterinary Medicine, College of Veterinary Medicine, Northwest A&F University, Yangling 712100, Xi’an City, Shaanxi, P.R. China;

    Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, Qingdao, Shandong, P.R. China;

  • Conference organizer
  • Original format: PDF
  • Text language: eng
  • Chinese Library Classification: Ecosystem ecology;
  • Keywords

    metagenomic; biomarker; machine learning; ReliefF; mRMR;
