Journal: Computational Statistics & Data Analysis

Estimating mutual information for feature selection in the presence of label noise

Abstract

A method for feature selection in classification problems contaminated by label noise is proposed. The performance of traditional feature selection algorithms often degrades sharply when some samples are wrongly labelled. The method combines a probabilistic label noise model with a nearest-neighbours-based entropy estimator to robustly evaluate the mutual information, a popular relevance criterion for feature selection. A backward greedy search procedure is used with this criterion to find relevant sets of features. Experiments establish that (i) there is a real need to take possible label noise into account when selecting features and (ii) the proposed methodology effectively reduces the negative impact of mislabelled data points on the feature selection process.
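
The two noise-free building blocks named in the abstract, a nearest-neighbour entropy estimator and a backward greedy search, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the Kozachenko-Leonenko form of the k-nearest-neighbour entropy estimator, computes the mutual information as I(X;Y) = H(X) - sum_c p(c) H(X|Y=c), and leaves out the probabilistic label noise model entirely. All function names, the choice k=3, and the synthetic data are hypothetical and chosen only for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_entropy(points, k=3):
    """Kozachenko-Leonenko k-NN estimate (in nats) of differential entropy.

    Assumed form: psi(N) - psi(k) + log(c_d) + (d/N) * sum_i log(r_i),
    where r_i is the Euclidean distance from point i to its k-th nearest
    neighbour and c_d is the volume of the d-dimensional unit ball.
    """
    n, d = len(points), len(points[0])
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    total = 0.0
    for i, p in enumerate(points):
        # Brute-force neighbour search; fine for a small illustrative sample.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += math.log(max(dists[k - 1], 1e-12))  # guard against log(0)
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * total / n


def mutual_information(X, y, features, k=3):
    """Estimate I(X_S; Y) = H(X_S) - sum_c p(c) H(X_S | Y=c) for subset S."""
    sub = [[row[f] for f in features] for row in X]
    mi = knn_entropy(sub, k)
    for c in set(y):
        rows = [s for s, label in zip(sub, y) if label == c]
        mi -= len(rows) / len(sub) * knn_entropy(rows, k)
    return mi


def backward_selection(X, y, n_keep, k=3):
    """Backward greedy search: repeatedly drop the feature whose removal
    leaves the remaining subset with the highest estimated MI."""
    selected = list(range(len(X[0])))
    while len(selected) > n_keep:
        to_drop = max(
            selected,
            key=lambda f: mutual_information(
                X, y, [g for g in selected if g != f], k
            ),
        )
        selected.remove(to_drop)
    return selected


# Synthetic data (hypothetical): feature 0 depends on the class label,
# features 1 and 2 are pure noise. Labels here are clean.
random.seed(0)
y = [i % 2 for i in range(200)]
X = [
    [random.gauss(5.0 * label, 1.0), random.gauss(0, 1), random.gauss(0, 1)]
    for label in y
]
print(backward_selection(X, y, n_keep=1))
```

On this synthetic data the search would be expected to retain the informative feature 0. Each class needs more than k samples for the conditional entropies to be defined. Handling mislabelled samples, which is the abstract's actual contribution, would require an additional noise model that this sketch does not attempt.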
