
Combining clustering of variables and feature selection using random forests



Abstract

High-dimensional data classification is a challenging problem. A standard approach to tackle this problem is to perform variable selection, e.g. using step-wise or LASSO procedures. Another standard way is to perform dimension reduction, e.g. by Principal Component Analysis or Partial Least Squares procedures. The approach proposed in this paper combines both dimension reduction and variable selection. First, a procedure of clustering of variables is used to build groups of correlated variables in order to reduce the redundancy of information. This dimension reduction step relies on the R package ClustOfVar, which can deal with both numerical and categorical variables. Secondly, the most relevant synthetic variables (which are numerical variables summarizing the groups obtained in the first step) are selected with a procedure of variable selection using random forests, implemented in the R package VSURF. The numerical performance of the proposed methodology, called CoV/VSURF, is compared with direct applications of VSURF or random forests on the original $p$ variables. Improvements obtained with the CoV/VSURF procedure are illustrated on two simulated mixed datasets (cases $n > p$ and $n \ll p$) and on a real proteomic dataset.
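The first step described in the abstract (grouping correlated variables, then summarizing each group by one numerical synthetic variable) can be illustrated with a minimal Python sketch. This is not the ClustOfVar algorithm: the greedy threshold grouping, the sign-aligned mean used as the synthetic variable, the 0.7 threshold, and all function names are assumptions chosen to keep the example short and dependency-free. It handles numerical variables only and runs on simulated data.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length numeric lists (non-constant inputs assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(x):
    """Center and scale a non-constant list to mean 0 and standard deviation 1."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((a - m) ** 2 for a in x) / n)
    return [(a - m) / s for a in x]


def cluster_variables(columns, threshold=0.7):
    """Greedy stand-in for clustering of variables (hypothetical helper).

    A variable joins the first group whose seed (first member) it correlates
    with, in absolute value, at or above the threshold; otherwise it starts
    a new group.
    """
    groups = []
    for j, col in enumerate(columns):
        for group in groups:
            if abs(pearson(columns[group[0]], col)) >= threshold:
                group.append(j)
                break
        else:
            groups.append([j])
    return groups


def synthetic_variable(columns, group):
    """Summarize a group by the mean of its standardized, sign-aligned members."""
    seed = columns[group[0]]
    aligned = []
    for j in group:
        sign = 1.0 if pearson(seed, columns[j]) >= 0 else -1.0
        aligned.append([sign * v for v in standardize(columns[j])])
    return [sum(col[i] for col in aligned) / len(aligned) for i in range(len(seed))]


# Simulated data (an assumption, not the paper's datasets): two latent
# factors, each observed through three noisy copies, one of them sign-flipped.
random.seed(0)
n = 200
f1 = [random.gauss(0, 1) for _ in range(n)]
f2 = [random.gauss(0, 1) for _ in range(n)]


def noisy(latent, sign=1.0):
    return [sign * v + random.gauss(0, 0.3) for v in latent]


columns = [noisy(f1), noisy(f1), noisy(f1, -1.0), noisy(f2), noisy(f2), noisy(f2)]

groups = cluster_variables(columns, threshold=0.7)
synthetic = [synthetic_variable(columns, g) for g in groups]
print(groups)
```

In the procedure the abstract describes, the second step would pass such synthetic variables, rather than the original $p$ variables, to a random-forest-based selection method (VSURF in the paper); that step is not reproduced here.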
