BMC Bioinformatics

Bag of Naïve Bayes: biomarker selection and classification from genome-wide SNP data

Abstract

Background: Multifactorial diseases arise from complex patterns of interaction between a set of genetic traits and the environment. To fully capture the genetic biomarkers that jointly explain the heritability component of a disease, all SNPs from a genome-wide association study should therefore be analyzed simultaneously.

Results: In this paper, we present Bag of Naïve Bayes (BoNB), an algorithm for genetic biomarker selection and subject classification from the simultaneous analysis of genome-wide SNP data. BoNB is based on the Naïve Bayes classification framework, enriched by three main features: bootstrap aggregating of an ensemble of Naïve Bayes classifiers; a novel strategy for ranking and selecting the attributes used by each classifier in the ensemble; and a permutation-based procedure for selecting significant biomarkers, based on their marginal utility in the classification process. BoNB is tested on the Wellcome Trust Case-Control study on Type 1 Diabetes, and its performance is compared with that of both a standard Naïve Bayes algorithm and HyperLASSO, a state-of-the-art penalized logistic regression algorithm for simultaneous genome-wide data analysis.

Conclusions: The significantly higher classification accuracy obtained by BoNB and the significance of the biomarkers identified from the Type 1 Diabetes dataset demonstrate the effectiveness of BoNB as an algorithm for both classification and biomarker selection from genome-wide SNP data.

Availability: Source code of the BoNB algorithm is released under the GNU General Public Licence and is available at http://www.dei.unipd.it/~sambofra/bonb.html.
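The bootstrap-aggregating idea named in the abstract can be illustrated with a minimal sketch. This is not the authors' BoNB implementation: it only bags plain categorical Naïve Bayes classifiers and combines them by majority vote, and it omits the attribute ranking and the permutation-based biomarker selection, which the abstract does not specify in enough detail to reproduce. The function names, the 0/1/2 genotype coding, the Laplace smoothing and the default of 25 classifiers are all assumptions made for illustration.

```python
import math
import random
from collections import Counter

N_GENOTYPES = 3  # assumption: each SNP is coded 0, 1 or 2 (minor-allele count)


def train_nb(X, y, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing (alpha)."""
    n_snps = len(X[0])
    class_sizes = Counter(y)
    model = {}
    for c, size in class_sizes.items():
        rows = [x for x, label in zip(X, y) if label == c]
        log_lik = []
        for j in range(n_snps):
            counts = Counter(row[j] for row in rows)
            denom = size + alpha * N_GENOTYPES
            # Smoothed log P(genotype g at SNP j | class c)
            log_lik.append([math.log((counts[g] + alpha) / denom)
                            for g in range(N_GENOTYPES)])
        model[c] = (math.log(size / len(y)), log_lik)
    return model


def predict_nb(model, x):
    """Return the class with the highest log posterior for genotype vector x."""
    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(log_lik[j][g] for j, g in enumerate(x))
    return max(model, key=score)


def bagged_nb(X, y, n_models=25, seed=0):
    """Train one Naive Bayes classifier per bootstrap resample of the subjects."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        # Sample n subjects with replacement (one bootstrap replicate)
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_nb([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_bagged(models, x):
    """Combine the ensemble by majority vote."""
    votes = Counter(predict_nb(m, x) for m in models)
    return votes.most_common(1)[0][0]
```

Here `X` is a list of per-subject genotype vectors and `y` the matching case/control labels; `bagged_nb(X, y)` returns the ensemble and `predict_bagged(models, x)` classifies a new subject. Majority voting is one common way to aggregate bagged classifiers and is used here only as a placeholder for whatever aggregation BoNB actually applies.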
