人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)

Improving Performance of the k-Nearest Neighbor Classifier by Combining Feature Selection with Feature Weighting


Abstract

The k-nearest neighbor (k-NN) classifier is a simple and effective classification approach, but it suffers from over-sensitivity to irrelevant and noisy features. There are two ways to reduce this sensitivity: one is to assign each feature a weight, and the other is to select a subset of relevant features. Existing research has shown that both approaches can improve generalization accuracy, but it is impossible to predict which one is better for a specific dataset. In this paper, we propose an algorithm that improves the effectiveness of k-NN by combining the two approaches: we first select all relevant features, and then assign a weight to each of them. Experiments were conducted on 14 datasets from the UCI Machine Learning Repository, and the results show that our algorithm achieves the highest accuracy, or comes close to it, on all test datasets, increasing generalization accuracy by 8.68% on average. It also achieves higher generalization accuracy than the well-known machine learning algorithms IB1-4 and C4.5.
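The abstract describes a two-stage pipeline: select the relevant features, then weight the survivors before running k-NN. Below is a minimal Python sketch of that general select-then-weight idea, not the paper's algorithm: the relevance score (mutual information), the 0.05 selection threshold, and the score-as-weight scheme are all illustrative assumptions, and a built-in scikit-learn dataset stands in for the UCI datasets used in the paper.

    from sklearn.datasets import load_wine              # stand-in dataset
    from sklearn.feature_selection import mutual_info_classif
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.preprocessing import StandardScaler

    X, y = load_wine(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Standardize so no feature dominates the Euclidean distance by scale alone.
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

    # Stage 1 (feature selection): keep features whose relevance score clears
    # a threshold. Mutual information and the 0.05 cutoff are assumptions;
    # the paper defines its own selection criterion.
    scores = mutual_info_classif(X_tr, y_tr, random_state=0)
    keep = scores > 0.05
    X_tr_sel, X_te_sel = X_tr[:, keep], X_te[:, keep]

    # Stage 2 (feature weighting): weight each surviving feature by its score.
    w = scores[keep]
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_tr_sel * w, y_tr)
    print("test accuracy: %.3f" % knn.score(X_te_sel * w, y_te))

Scaling column j by w[j] is a standard trick for feature-weighted k-NN: the plain Euclidean distance on the scaled data equals a weighted Euclidean distance with per-feature weight w[j]**2 on the original data, so no custom distance metric is needed.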
