Journal: Expert Systems with Applications

Sequential random k-nearest neighbor feature selection for high-dimensional data



Abstract

Feature selection based on an ensemble classifier has been recognized as a crucial technique for modeling high-dimensional data. Feature selection based on the random forests model, which is constructed by aggregating multiple decision tree classifiers, has been widely used. However, a lack of stability and balance in decision trees decreases the robustness of random forests. This limitation motivated us to propose a feature selection method based on newly designed nearest-neighbor ensemble classifiers. The proposed method finds significant features by using an iterative procedure. We performed experiments with 20 datasets of microarray gene expressions to examine the properties of the proposed method and compared it with random forests. The results demonstrated the effectiveness and robustness of the proposed method, especially when the number of features exceeds the number of observations. © 2014 Elsevier Ltd. All rights reserved.
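The abstract does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of "nearest-neighbor ensemble" plus "iterative procedure", not the authors' published method. It assumes that many k-NN classifiers are each fitted on a small random subset of features, that a feature's support is the mean leave-one-out accuracy of the classifiers that used it, and that the lowest-support features are dropped round by round. All function names, parameters, and defaults (`knn_loo_accuracy`, `sequential_random_knn_selection`, `n_models`, `subset_size`, `drop_frac`) are hypothetical.

```python
import random


def knn_loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    n = len(X)
    correct = 0
    for i in range(n):
        # Squared Euclidean distance from sample i to every other sample.
        dists = []
        for j in range(n):
            if j == i:
                continue
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            dists.append((d, y[j]))
        dists.sort(key=lambda t: t[0])
        # Majority vote among the k nearest neighbours.
        votes = {}
        for _, label in dists[:k]:
            votes[label] = votes.get(label, 0) + 1
        predicted = max(votes, key=votes.get)
        correct += predicted == y[i]
    return correct / n


def sequential_random_knn_selection(X, y, n_keep, n_models=200,
                                    subset_size=3, drop_frac=0.3, k=3, seed=0):
    """Assumed procedure: iteratively drop the features with the lowest support."""
    rng = random.Random(seed)
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        m = min(subset_size, len(active))
        scores = {f: [] for f in active}
        for _ in range(n_models):
            subset = rng.sample(active, m)  # random feature subset for one k-NN
            acc = knn_loo_accuracy(X, y, subset, k)
            for f in subset:
                scores[f].append(acc)
        # Support: mean accuracy of the models that used the feature (0 if never drawn).
        support = {f: (sum(s) / len(s) if s else 0.0) for f, s in scores.items()}
        ranked = sorted(active, key=lambda f: support[f], reverse=True)
        n_drop = max(1, int(drop_frac * len(active)))
        active = ranked[:max(n_keep, len(active) - n_drop)]
    return sorted(active)


def demo(seed=1):
    """Toy data: feature 0 separates the two classes, features 1..5 are noise."""
    rng = random.Random(seed)
    y = [0] * 10 + [1] * 10
    X = [[10 * label + rng.random()] + [rng.random() for _ in range(5)]
         for label in y]
    return sequential_random_knn_selection(X, y, n_keep=2, subset_size=2)
```

Calling `demo()` returns two feature indices, and the informative feature 0 should be among them. In a setting like the microarray data mentioned above, where features far outnumber observations, `n_models` and `subset_size` would need to be large enough for every feature to be drawn several times per round.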
