Noisy data elimination using mutual k-nearest neighbor for classification mining


Abstract

k nearest neighbor (kNN) is an effective and powerful lazy learning algorithm that is also easy to implement. However, its performance relies heavily on the quality of the training data. In many complex real-world applications, noise from various sources is prevalent in large-scale databases, and eliminating anomalies to improve data quality remains a challenge. To alleviate this problem, this paper proposes a new anomaly removal and learning algorithm within the kNN framework. The primary characteristic of the method is that the evidence used for removing anomalies and for predicting the class labels of unseen instances is the set of mutual nearest neighbors, rather than the k nearest neighbors. The advantage is that pseudo nearest neighbors can be identified and excluded from the prediction process, so the final learning result is more credible. An extensive comparative experimental analysis on UCI datasets provides empirical evidence that the proposed method enhances the performance of the k-NN rule.
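
The abstract describes the mutual-nearest-neighbor idea only in prose, so the following is a minimal sketch rather than the authors' implementation. It assumes Euclidean distance, treats a training instance with no mutual nearest neighbor as noise, and predicts by majority vote over the mutual neighbors of the query. All of these choices, along with the function names and the toy data, are illustrative assumptions.

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def k_nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: dist(points[idx], points[j]))
    return set(others[:k])


def mutual_neighbors(points, k):
    """For each i, the set of j such that i and j appear in each other's kNN sets."""
    knn = [k_nearest(i, points, k) for i in range(len(points))]
    return [{j for j in knn[i] if i in knn[j]} for i in range(len(points))]


def remove_noise(points, labels, k):
    """Drop instances that have no mutual nearest neighbor (assumed noise criterion)."""
    mnn = mutual_neighbors(points, k)
    keep = [i for i in range(len(points)) if mnn[i]]
    return [points[i] for i in keep], [labels[i] for i in keep]


def predict(query, points, labels, k):
    """Majority vote over training points that are mutual neighbors of the query.

    Returns None when the query has no mutual neighbor (assumed fallback).
    """
    augmented = list(points) + [query]
    q = len(points)  # index of the query inside the augmented list
    knn_q = k_nearest(q, augmented, k)
    # Keep only neighbors that also rank the query among their own k nearest.
    mutual = [j for j in knn_q if q in k_nearest(j, augmented, k)]
    if not mutual:
        return None
    return Counter(labels[j] for j in mutual).most_common(1)[0][0]


# Toy data: two tight clusters plus one isolated point carrying label "a".
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (5, 30)]
labels = ["a", "a", "a", "b", "b", "b", "a"]

clean_points, clean_labels = remove_noise(points, labels, k=2)
print(len(points) - len(clean_points), "instance(s) removed")
print(predict((0.2, 0.3), clean_points, clean_labels, k=2))
```

In this toy set the isolated point lists two cluster members among its nearest neighbors, but neither of them lists it in return, so it has no mutual neighbor and is dropped. A plain kNN rule would still let such a "pseudo" neighbor cast a vote, which is the situation the abstract says the mutual criterion avoids.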

Bibliographic details

  • Source
    The Journal of Systems and Software | 2012, Issue 5 | pp. 1067-1074 | 8 pages
  • Authors

    Huawen Liu; Shichao Zhang;

  • Author affiliations

    Department of Computer Science, Zhejiang Normal University, China; Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, China;

    College of Computer Science and Information Technology, Guangxi Normal University, China; Faculty of Engineering and Information Technology, University of Technology, Sydney, Australia;

  • Indexing information
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification
  • Keywords

    data mining; pattern classification; kNN; mutual nearest neighbor; data reduction;

