Published in: IEEE Transactions on Neural Networks and Learning Systems

A Training Data Set Cleaning Method by Classification Ability Ranking for the k-Nearest Neighbor Classifier



Abstract

The $k$-nearest neighbor (KNN) rule is a successful technique in pattern classification due to its simplicity and effectiveness. As a supervised classifier, however, KNN usually suffers from low-quality samples in the training data set. Training data set cleaning (TDC) methods are therefore needed to improve classification accuracy by removing noisy, or even wrong, samples from the original training data set. In this paper, we propose a classification ability ranking (CAR)-based TDC method to improve the performance of a KNN classifier. In the cleaning stage, the proposed classification ability function ranks each training sample by its contribution to correctly classifying other training samples when it acts as one of their $k$ nearest neighbors under the leave-one-out (LV1) strategy. A training sample that is likely to cause other samples to be misclassified during these LV1 KNN classifications is considered to have lower classification ability and is removed from the original training data set. Extensive experiments on ten real-world data sets show that the proposed CAR-based TDC method can significantly reduce the classification error rates of KNN-based classifiers, while also reducing computational complexity thanks to the smaller cleaned training data set.
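
The abstract does not state the exact form of the classification ability function, so the following is only a minimal sketch of the general idea under assumed definitions, not the authors' method. It assumes a hypothetical score: each time a training sample appears among the k nearest neighbors of another training sample (with that other sample left out), it gains 1 if the two labels agree and loses 1 if they disagree. Samples scoring below an assumed threshold are dropped. The function names, the scoring rule, the threshold, and the toy data are all illustrative assumptions.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest(index, points, k):
    """Indices of the k training points closest to points[index], excluding itself."""
    others = [j for j in range(len(points)) if j != index]
    others.sort(key=lambda j: squared_distance(points[index], points[j]))
    return others[:k]


def ability_scores(points, labels, k):
    """Assumed score (not from the paper): agreements minus disagreements
    accumulated each time a sample serves as a neighbor of a left-out sample."""
    scores = [0] * len(points)
    for j in range(len(points)):          # j is the left-out sample
        for i in k_nearest(j, points, k):  # i votes on j's label
            scores[i] += 1 if labels[i] == labels[j] else -1
    return scores


def clean_training_set(points, labels, k=3, threshold=0):
    """Return the indices kept (score >= threshold) and all scores."""
    scores = ability_scores(points, labels, k)
    keep = [i for i, s in enumerate(scores) if s >= threshold]
    return keep, scores


def knn_predict(query, points, labels, k):
    """Plain majority-vote KNN prediction for a single query point."""
    order = sorted(range(len(points)),
                   key=lambda j: squared_distance(query, points[j]))
    votes = Counter(labels[j] for j in order[:k])
    return votes.most_common(1)[0][0]


# Toy data: two clusters, plus one sample labeled "b" inside the "a" cluster.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (5, 5), (5, 6), (6, 5), (6, 6)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

keep, scores = clean_training_set(points, labels, k=3)
clean_points = [points[i] for i in keep]
clean_labels = [labels[i] for i in keep]

query = (0.4, 0.6)
before = knn_predict(query, points, labels, k=1)
after = knn_predict(query, clean_points, clean_labels, k=1)
```

In this toy run the sample at index 4 disagrees with every left-out sample it is a neighbor of, so it receives a negative score and is dropped. A 1-NN query near it then changes from "b" to "a". A real implementation would use the ranking function defined in the paper and an efficient neighbor search instead of the quadratic sort used here.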
