Journal: Computational Statistics

Bayesian variable selection in multinomial probit model for classifying high-dimensional data

Abstract

Selecting a small number of relevant genes for classification has received a great deal of attention in microarray data analysis. While methods for microarray data with only two classes remain relevant, it is important to develop more efficient algorithms for classification with any number of classes. In this paper, we propose a Bayesian stochastic search variable selection approach for multi-class classification, which identifies relevant genes by assessing sets of genes jointly. We consider a multinomial probit model with a generalized g-prior for the regression coefficients. An efficient algorithm based on MCMC simulation is developed for sampling the parameters from the posterior distribution. This algorithm is robust to the choice of initial value and produces posterior probabilities of relevant genes for biological interpretation. We demonstrate the performance of the approach on four well-known gene expression profiling data sets: the leukemia, lymphoma, SRBCTs, and NCI60 data. Compared with other classification approaches, our approach selects fewer relevant genes and achieves competitive classification accuracy.
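
The abstract states that the sampler produces posterior probabilities of relevant genes. A minimal sketch of how such probabilities are commonly estimated from MCMC output is given below; it is an illustration under stated assumptions, not the authors' implementation. It assumes each MCMC iteration yields a 0/1 inclusion indicator per gene, and the function names, the burn-in argument, the 0.5 threshold, and the toy draws are all hypothetical.

```python
from typing import List, Sequence


def inclusion_probabilities(
    gamma_draws: Sequence[Sequence[int]], burn_in: int = 0
) -> List[float]:
    """Estimate each gene's posterior inclusion probability as the fraction
    of retained MCMC draws in which its 0/1 indicator equals 1."""
    kept = gamma_draws[burn_in:]  # discard the first `burn_in` iterations
    if not kept:
        raise ValueError("no draws left after burn-in")
    n_genes = len(kept[0])
    return [sum(draw[j] for draw in kept) / len(kept) for j in range(n_genes)]


def select_genes(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Return indices of genes whose estimated probability meets the threshold.
    The 0.5 default is an assumption, not a value taken from the paper."""
    return [j for j, p in enumerate(probs) if p >= threshold]


# Toy, hand-written indicator draws (illustrative only, not results from the paper).
# Rows are MCMC iterations; columns are genes.
toy_draws = [
    [1, 0, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]

probs = inclusion_probabilities(toy_draws, burn_in=1)
selected = select_genes(probs, threshold=0.5)
print(probs, selected)
```

In this sketch the ranking of genes depends only on how often each indicator is switched on after burn-in, so the same summary applies whatever sampler produced the indicator draws.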