Journal: Applied stochastic models in business and industry

Facilitating high-dimensional transparent classification via empirical Bayes variable selection



Abstract

We present a two-step approach to classification problems in the large P, small N setting, where the number of predictors may exceed the sample size. We assume that the association between the predictors and the class variable has an approximately linear-logistic form, but we allow the class boundaries to be nonlinear. We further assume that the number of true predictors is relatively small. In the first step, we use a binomial generalized linear model to identify which predictors are associated with each class. In the second step, we restrict the data set to these predictors and run a nonlinear classifier, such as a random forest or a support vector machine. We show that, without the variable screening step, the classification performance of both the random forest and the support vector machine degrades when many of the P predictors are unrelated to the class.
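The two-step idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' method. The abstract does not describe the empirical Bayes selection procedure, so step 1 here substitutes a simple marginal screen: the score-test z statistic for the slope of a one-predictor logistic (binomial GLM) model, keeping the `k` predictors with the largest |z|. Step 2 substitutes a k-nearest-neighbour vote for the random forest or support vector machine named in the abstract. All function names, the synthetic data, and the choice of `k` are assumptions made for illustration.

```python
import math
import random


def score_z(x, y):
    """Score-test z statistic for the slope of a one-predictor logistic model."""
    n = len(y)
    y_bar = sum(y) / n
    x_bar = sum(x) / n
    u = sum(xi * (yi - y_bar) for xi, yi in zip(x, y))
    var = y_bar * (1 - y_bar) * sum((xi - x_bar) ** 2 for xi in x)
    # A constant predictor or a single-class response carries no information.
    return u / math.sqrt(var) if var > 0 else 0.0


def screen(X, y, k):
    """Step 1 (assumed stand-in): keep the k columns with the largest |z|."""
    p = len(X[0])
    z = [abs(score_z([row[j] for row in X], y)) for j in range(p)]
    top = sorted(range(p), key=lambda j: -z[j])[:k]
    return sorted(top)


def knn_predict(X_train, y_train, x_new, cols, n_neighbors=5):
    """Step 2 (assumed stand-in): majority vote of nearest neighbours on `cols`."""
    def dist(row):
        return sum((row[j] - x_new[j]) ** 2 for j in cols)

    nearest = sorted(range(len(X_train)), key=lambda i: dist(X_train[i]))
    votes = sum(y_train[i] for i in nearest[:n_neighbors])
    return 1 if 2 * votes > n_neighbors else 0


def accuracy(X_train, y_train, X_test, y_test, cols):
    """Fraction of test rows classified correctly using only `cols`."""
    hits = sum(
        knn_predict(X_train, y_train, x, cols) == t
        for x, t in zip(X_test, y_test)
    )
    return hits / len(y_test)


def simulate(n, p, rng):
    """Synthetic data: only columns 0 and 1 carry signal, the rest are noise."""
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [1 if row[0] + row[1] > 0 else 0 for row in X]
    return X, y


if __name__ == "__main__":
    rng = random.Random(1)
    n_train, n_test, p = 80, 200, 200  # more predictors than training rows
    X, y = simulate(n_train + n_test, p, rng)
    X_tr, y_tr, X_te, y_te = X[:n_train], y[:n_train], X[n_train:], y[n_train:]

    selected = screen(X_tr, y_tr, k=2)
    print("selected columns:", selected)
    print("accuracy, all predictors:",
          accuracy(X_tr, y_tr, X_te, y_te, range(p)))
    print("accuracy, screened predictors:",
          accuracy(X_tr, y_tr, X_te, y_te, selected))
```

Running the script prints the selected columns and the held-out accuracy with and without screening, which gives a rough feel for the degradation the abstract reports when many predictors are unrelated to the class. A marginal screen of this kind can miss predictors that matter only jointly, so it should be read as a placeholder for a proper variable selection step.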
