A Genetic Programming approach for feature selection in highly dimensional skewed data

Abstract

High dimensionality, also known as the curse of dimensionality, is still a major challenge for automatic classification solutions. Accordingly, several feature selection (FS) strategies have been proposed for dimensionality reduction over the years. However, they may perform poorly in the face of unbalanced data. In this work, we propose a novel feature selection strategy based on Genetic Programming that is resilient to data skewness; in other words, it works well with both balanced and unbalanced data. The proposed strategy aims at combining the most discriminative feature sets selected by distinct feature selection metrics in order to obtain a more effective and impartial set of the most discriminative features, starting from the hypothesis that distinct feature selection metrics produce different (and potentially complementary) feature space projections. We evaluated our proposal on biological and textual datasets. Our experimental results show that our proposed solution not only increases the efficiency of the learning process, reducing the size of the data space by up to 83%, but also significantly increases its effectiveness in some scenarios. (C) 2017 Elsevier B.V. All rights reserved.
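The idea of combining feature sets chosen by distinct metrics can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not specify the details assumed here: candidate solutions are expression trees whose leaves are the top-k feature sets of each metric and whose internal nodes are set operators (`union`, `inter`, `diff`); a plain random search stands in for the evolutionary loop of Genetic Programming; and macro-F1 is offered as one skew-aware score a fitness function could return. All names (`top_k`, `random_tree`, `evaluate`, `macro_f1`, `search`) are hypothetical.

```python
import random

# Set operators used as internal tree nodes (an assumption of this sketch).
OPS = {
    "union": lambda a, b: a | b,
    "inter": lambda a, b: a & b,
    "diff": lambda a, b: a - b,
}

def top_k(scores, k):
    """Indices of the k highest-scoring features under one metric."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return frozenset(ranked[:k])

def random_tree(leaf_names, depth, rng):
    """Random expression tree: leaves are metric names, nodes are set operators."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(leaf_names)
    op = rng.choice(sorted(OPS))
    return (op,
            random_tree(leaf_names, depth - 1, rng),
            random_tree(leaf_names, depth - 1, rng))

def evaluate(tree, leaf_sets):
    """Reduce a tree to the feature set it denotes."""
    if isinstance(tree, str):
        return leaf_sets[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, leaf_sets), evaluate(right, leaf_sets))

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so a minority class counts equally."""
    f1s = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        f1s.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return sum(f1s) / len(f1s)

def search(leaf_sets, fitness, n_candidates=50, depth=3, seed=0):
    """Random search over trees; a simple stand-in for GP evolution.

    The single-metric sets are always evaluated first as baselines.
    """
    rng = random.Random(seed)
    names = sorted(leaf_sets)
    candidates = names + [random_tree(names, depth, rng)
                          for _ in range(n_candidates)]
    best_tree, best_score = None, float("-inf")
    for tree in candidates:
        features = evaluate(tree, leaf_sets)
        if not features:  # an empty feature set cannot be scored
            continue
        score = fitness(features)
        if score > best_score:
            best_tree, best_score = tree, score
    return best_tree, best_score

# Hypothetical per-feature scores from two metrics over six features.
leaf_sets = {
    "metric_a": top_k([0.9, 0.8, 0.1, 0.7, 0.2, 0.0], k=3),
    "metric_b": top_k([0.1, 0.9, 0.8, 0.0, 0.7, 0.2], k=3),
}
shared = evaluate(("inter", "metric_a", "metric_b"), leaf_sets)
```

In practice the `fitness` argument would train a classifier on the selected columns and return a validation score such as `macro_f1`; it is left as a parameter here because the abstract does not name the classifier or the fitness measure.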

Bibliographic details

  • Source
    Neurocomputing | 2018, Issue 17 | pp. 554-569 | 16 pages
  • Author affiliations

    Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil;

    Univ Fed Sao Joao del Rei, Dept Comp Sci, Sao Joao Del Rei, MG, Brazil;

    Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil;

    Univ Fed Sao Joao del Rei, Dept Comp Sci, Sao Joao Del Rei, MG, Brazil;

    Univ Fed Sao Joao del Rei, Dept Comp Sci, Sao Joao Del Rei, MG, Brazil;

    Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil;

    Univ Fed Minas Gerais, Dept Comp Sci, Belo Horizonte, MG, Brazil;

    Univ Fed Sao Joao del Rei, Dept Comp Sci, Sao Joao Del Rei, MG, Brazil;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification (CLC)
  • Keywords

    Feature selection; Classification; Genetic Programming;
