Journal: Computer Engineering & Science (《计算机工程与科学》)

A Parallel MRACO-PAM Clustering Algorithm Based on MapReduce

Abstract

Clustering analysis is one of the most commonly used methods in data processing, and partitioning around medoids (PAM) has been one of the most popular clustering algorithms since it was proposed in 1990. PAM addresses the sensitivity of K-Means to dirty data (outliers) during clustering. However, the original PAM converges slowly and handles large datasets inefficiently because of its time complexity. To address these problems, we use an ant colony search mechanism to strengthen PAM's global search and local exploration capabilities, and propose MRACO-PAM, a parallel algorithm built on the MapReduce programming framework. Experimental results show that the parallel MRACO-PAM clustering algorithm improves convergence speed, can process large-scale data, and scales well.
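The abstract does not give the details of MRACO-PAM, so the following is only a minimal sketch of how a medoid-based clustering iteration might be split into a map step and a reduce step. It makes several assumptions: it runs in a single Python process rather than on a MapReduce cluster, it uses the simpler alternating k-medoids update instead of PAM's full swap search, and it omits the ant colony component entirely. The names `map_assign`, `reduce_update`, and `k_medoids` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from math import dist


def map_assign(point, medoids):
    """Map step: emit (index of the nearest medoid, point)."""
    nearest = min(range(len(medoids)), key=lambda i: dist(point, medoids[i]))
    return nearest, point


def reduce_update(members):
    """Reduce step: pick the member with the smallest total distance to the others."""
    return min(members, key=lambda c: sum(dist(c, p) for p in members))


def k_medoids(points, medoids, max_iter=20):
    """Alternate the map and reduce steps until the medoids stop changing."""
    medoids = list(medoids)
    for _ in range(max_iter):
        groups = defaultdict(list)
        for p in points:
            key, value = map_assign(p, medoids)
            groups[key].append(value)  # stands in for the shuffle phase
        new_medoids = [
            reduce_update(groups[i]) if groups[i] else medoids[i]
            for i in range(len(medoids))
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids


# Two groups on the diagonal, plus one outlier at (40, 40).
points = [
    (1.0, 1.0), (2.0, 2.0), (3.0, 3.0),
    (10.0, 10.0), (11.0, 11.0), (12.0, 12.0), (13.0, 13.0),
    (40.0, 40.0),
]
result = k_medoids(points, medoids=[points[0], points[3]])
print(result)  # [(2.0, 2.0), (12.0, 12.0)]
```

In this toy data the outlier joins the second cluster, whose mean would be pulled to (17.2, 17.2), while the medoid remains the actual data point (12.0, 12.0). This illustrates the robustness to dirty data that the abstract attributes to PAM.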
