ROBUST FEATURE SELECTION TECHNIQUE USING RANK AGGREGATION

Abstract

Although feature selection is a well-developed research area, there is an ongoing need to develop methods to make classifiers more efficient. One important challenge is the lack of a universal feature selection technique that produces similar outcomes with all types of classifiers. This is because all feature selection techniques have individual statistical biases, whereas classifiers exploit different statistical properties of data for evaluation. In numerous situations, this can put researchers in a dilemma with regard to which feature selection method and which classifier to choose from a vast range of choices. In this article, we propose a technique that aggregates the consensus properties of various feature selection methods in order to develop a more optimal solution. The ensemble nature of our technique makes it more robust across various classifiers. In other words, it is stable toward achieving similar and, ideally, higher classification accuracy across a wide variety of classifiers. We quantify this concept of robustness with a measure known as the robustness index (RI). We perform an extensive empirical evaluation of our technique on eight datasets with different dimensions, including arrhythmia, lung cancer, Madelon, mfeat-fourier, Internet ads, leukemia-3c, embryonal tumor, and a real-world dataset, viz., acute myeloid leukemia (AML). We demonstrate not only that our algorithm is more robust, but also that, compared with other techniques, our algorithm improves the classification accuracy by approximately 3-4% in a dataset with fewer than 500 features and by more than 5% in a dataset with more than 500 features, across a wide range of classifiers.
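The abstract does not state which aggregation rule the authors use. As a rough illustration of the general idea of combining several feature rankings into one consensus ranking, the following is a minimal sketch that assumes a Borda-style mean-rank aggregation. The function names and the score values are hypothetical and are not taken from the paper.

```python
def rank_positions(scores):
    """Return the rank of each feature (0 = best), where a higher score is better."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, feature in enumerate(order):
        ranks[feature] = position
    return ranks


def aggregate_ranks(score_lists):
    """Order features by their mean rank across all selection methods.

    This is an assumed Borda-style rule, not necessarily the paper's method.
    """
    per_method = [rank_positions(s) for s in score_lists]
    n_features = len(score_lists[0])
    mean_rank = [
        sum(r[f] for r in per_method) / len(per_method)
        for f in range(n_features)
    ]
    # sorted() is stable, so ties keep the original feature order
    return sorted(range(n_features), key=lambda f: mean_rank[f])


# Made-up relevance scores for 4 features from 3 hypothetical selection methods
scores = [
    [0.9, 0.1, 0.5, 0.3],
    [0.8, 0.2, 0.7, 0.1],
    [0.4, 0.3, 0.9, 0.2],
]
consensus = aggregate_ranks(scores)  # feature indices, best first
top_k = consensus[:2]                # keep the two best-ranked features
```

The selected `top_k` features could then be passed to several different classifiers to compare how stable the accuracy is across them, which is the kind of robustness the abstract refers to.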

Bibliographic details

  • Source
    Applied Artificial Intelligence, 2014, No. 3, pp. 243-257 (15 pages)
  • Author affiliations

    College of Science and Engineering, University of Minnesota at Twin Cities, Department of Computer Science, 4-192 Keller Hall, 200 Union St., Minneapolis, MN, 55455 USA;

    Masonic Cancer Center, University of Minnesota at Twin Cities, Twin Cities, Minnesota, USA;

    College of Science and Engineering, University of Minnesota at Twin Cities, Twin Cities, Minnesota, USA;

  • Original format: PDF
  • Language of text: English
