Journal: 长春理工大学学报(自然科学版) (Journal of Changchun University of Science and Technology, Natural Science Edition)

Research on an Improved KNN Classification Algorithm Based on the MapReduce Programming Model

Abstract

This paper adopts an attribute reduction algorithm that reduces the data samples to be classified in two passes: an initial attribute reduction of the decision table, followed by a second reduction based on core attribute values. Attribute reduction removes redundant data from the data set, which in turn improves the classification accuracy of the KNN algorithm. On this basis, the MapReduce parallel programming model is applied to run parallel classification experiments on a Hadoop cluster. The experimental results show that the improved algorithm executes considerably more efficiently in the cluster environment and handles the experimental data efficiently. The speedup ratio of the experiments also improves noticeably.
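The pipeline described in the abstract can be illustrated with a minimal single-process sketch in Python. It is not the authors' implementation: the abstract gives no details of the reduction algorithm or the Hadoop job, so the reduct is assumed to be already computed and passed in as `kept` (a hypothetical tuple of retained attribute indices), the distance is assumed to be squared Euclidean, and the function names `project`, `map_partition`, `reduce_candidates`, and `knn_classify` are illustrative. The map step finds the k nearest neighbours within each data split, and the reduce step merges those candidates and takes a majority vote.

```python
import heapq
from collections import Counter
from functools import reduce


def project(row, kept):
    """Keep only the attributes in `kept` (assumed output of attribute reduction)."""
    return tuple(row[i] for i in kept)


def map_partition(partition, query, k, kept):
    """Map step: the k nearest (distance, label) pairs within one data split."""
    q = project(query, kept)
    scored = []
    for features, label in partition:
        f = project(features, kept)
        # Squared Euclidean distance (an assumption; the abstract names no metric).
        dist = sum((a - b) ** 2 for a, b in zip(f, q))
        scored.append((dist, label))
    return heapq.nsmallest(k, scored)


def reduce_candidates(left, right, k):
    """Reduce step: merge two candidate lists and keep the k nearest overall."""
    return heapq.nsmallest(k, left + right)


def knn_classify(partitions, query, k, kept):
    """Run map over every split, reduce the results, then take a majority vote."""
    local = [map_partition(p, query, k, kept) for p in partitions]
    merged = reduce(lambda a, b: reduce_candidates(a, b, k), local)
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


# Toy data: the third attribute is treated as redundant and dropped via `kept`.
partitions = [
    [((1.0, 1.0, 9.0), "a"), ((1.2, 0.8, 0.0), "a")],
    [((5.0, 5.0, 1.0), "b"), ((5.2, 4.9, 8.0), "b"), ((0.9, 1.1, 5.0), "a")],
]
print(knn_classify(partitions, (1.1, 1.0, 4.0), k=3, kept=(0, 1)))
```

Because each split returns at most k candidates, the reduce step only has to compare a small number of pairs regardless of how large the splits are, which is what makes the classification step straightforward to parallelise. On a real Hadoop cluster the map and reduce functions would run as separate tasks rather than in one process.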
