
Feature selection technology based on sample unbiased evaluation

Abstract

This paper addresses feature selection for unbalanced data sets, in which the classes differ in size. ReliefF has proved successful at identifying irrelevant features, but it is considered a biased approach on unbalanced data sets. This paper describes an effective, fair method that overcomes this defect. Furthermore, to counter ReliefF's sensitivity to noisy or irrelevant features when selecting the k nearest samples, a feature distance is proposed as a substitute for the Euclidean distance. Experiments on manually constructed data and UCI data sets indicate that the improved method works better than ReliefF and InfoGain when used as a preprocessing step for naive Bayes and C4.5.
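For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal sketch of standard ReliefF for numeric features, not the paper's improved method: the abstract does not specify the fair weighting or the feature distance, so neither is implemented here. The function name `relieff`, its parameters, the min-max scaling, and the use of a mean over the neighbours actually found are assumptions made for illustration.

```python
import numpy as np

def relieff(X, y, k=5, n_iter=None, seed=0):
    """Baseline ReliefF feature weights for numeric features (illustrative)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape

    # Scale every feature to [0, 1] so per-feature differences are comparable.
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0              # constant features: avoid division by zero
    Xs = (X - lo) / span

    # Class priors P(C), estimated from class frequencies.
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n))

    rng = np.random.default_rng(seed)
    m = n if n_iter is None else n_iter
    picks = rng.permutation(n)[:m] if m <= n else rng.integers(0, n, size=m)

    w = np.zeros(d)
    for i in picks:
        diff = np.abs(Xs - Xs[i])                  # per-feature diff to all samples
        dist = np.sqrt((diff ** 2).sum(axis=1))    # Euclidean distance (assumption)
        for c in classes:
            idx = np.flatnonzero(y == c)
            idx = idx[idx != i]                    # never use the sample itself
            if idx.size == 0:
                continue
            nearest = idx[np.argsort(dist[idx])[:k]]
            contrib = diff[nearest].mean(axis=0)   # mean over neighbours found
            if c == y[i]:
                w -= contrib / m                   # nearest hits lower the weight
            else:
                # Nearest misses raise the weight, scaled by the class prior.
                w += prior[c] / (1.0 - prior[y[i]]) * contrib / m
    return w
```

The prior term `prior[c] / (1.0 - prior[y[i]])` is where class sizes enter the estimate in this baseline, and the `dist` line is where noisy or irrelevant features can distort the choice of nearest samples.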
