Journal: 《信息网络安全》

Research and Implementation of a Hybrid Clustering Algorithm for Big Data Processing

Abstract

With the rapid development of information technology, the era of big data has arrived. Data analysis and processing have become a research focus, and data mining in particular has been widely studied and applied. Building on a study of clustering algorithms, this paper examines partition-based clustering and bottom-up hierarchical clustering, and proposes a hybrid clustering algorithm that optimizes and then combines the two. The algorithm avoids the problem of randomly chosen initial cluster centers found in partition-based methods: it first uses partition-based clustering to initialize the data set, then applies bottom-up hierarchical clustering to the processed data, which can greatly improve clustering speed. By combining the advantages of the two traditional families of clustering algorithms and offsetting their respective deficiencies, the hybrid algorithm improves running efficiency without loss of accuracy. Finally, simulation experiments carried out with R confirm the effectiveness and feasibility of the proposed algorithm.
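The two-stage idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the algorithm's details, so everything specific here is an assumption: k-means stands in for the partition-based stage, farthest-first seeding stands in for whatever non-random initialization the authors use, centroid distance is used as the merge criterion, and the names `hybrid_cluster`, `m` (number of intermediate micro-clusters), and `k` (final number of clusters) are hypothetical. The sketch is written in Python rather than the R used for the paper's experiments.

```python
import math


def mean(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def farthest_first_seeds(points, m):
    """Deterministic seeding (an assumption, not the paper's method):
    start at the point nearest the global mean, then repeatedly add the
    point farthest from all seeds chosen so far."""
    center = mean(points)
    seeds = [min(points, key=lambda p: math.dist(p, center))]
    while len(seeds) < m:
        seeds.append(
            max(points, key=lambda p: min(math.dist(p, s) for s in seeds))
        )
    return seeds


def kmeans(points, m, iters=20):
    """Partition stage: plain k-means into m micro-clusters."""
    centers = farthest_first_seeds(points, m)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[j].append(p)
        # Keep the old center if a group ended up empty.
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return [g for g in groups if g]


def agglomerate(groups, k):
    """Hierarchical stage: bottom-up merging of the two micro-clusters
    with the closest centroids until only k clusters remain."""
    clusters = [list(g) for g in groups]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))
    return clusters


def hybrid_cluster(points, k, m):
    """Partition into m micro-clusters, then merge bottom-up down to k."""
    return agglomerate(kmeans(points, m), k)


if __name__ == "__main__":
    points = [
        (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
        (5.0, 5.0), (5.1, 5.2), (5.2, 4.9),
        (9.0, 0.0), (9.1, 0.2), (8.9, 0.1),
    ]
    for cluster in hybrid_cluster(points, k=3, m=5):
        print(sorted(cluster))
```

The efficiency argument in the abstract is visible in the structure: the hierarchical stage compares only the m micro-clusters rather than every individual data point, so its pairwise search is much smaller than running agglomerative clustering on the raw data.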
