Conference: IEEE International Conference on High Performance Computing and Communications

Microblog Hotspot Discovery Method Based on Improved K-Means Algorithm



Abstract

The K-means algorithm is one of the most frequently used clustering algorithms in hot topic discovery. However, it requires the number of clusters K to be chosen in advance and easily falls into a local optimum, so its clustering accuracy is limited, which directly affects the quality of hotspot discovery. This paper proposes an improved K-means algorithm for fast clustering of microblog texts. Single-pass clustering, driven by the high-frequency words and the similarities of the microblog texts, yields the number of clusters K and the initial cluster centers, which addresses the sensitivity of K-means to the K value and to the initial centers. Experimental comparison and analysis indicate that the improved algorithm compensates for these shortcomings of K-means and improves both the efficiency and the accuracy of clustering. Applied to a hot topic discovery model, the improved K-means algorithm is verified by experiments to be effective and to achieve high accuracy.
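The seeding idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the feature weighting, similarity measure, or threshold, so whitespace tokenization, raw term-frequency vectors, cosine similarity, and the `threshold` parameter are all assumptions, and the function names (`vectorize`, `single_pass`, `kmeans`) are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Term-frequency vector; whitespace tokenization is an assumption."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-weight mappings."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Centroid of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # adds weights term by term
    return {t: w / len(vectors) for t, w in total.items()}


def single_pass(vectors, threshold=0.5):
    """One scan over the data: each vector joins the most similar
    existing cluster if the similarity reaches `threshold` (an assumed
    value), otherwise it opens a new cluster. The number of clusters
    found becomes K, and the cluster centroids become the initial centers."""
    clusters = []
    for v in vectors:
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(v, mean(c))
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([v])
        else:
            best.append(v)
    return [mean(c) for c in clusters]


def kmeans(vectors, centers, max_iter=20):
    """Standard K-means iterations (cosine-based assignment), started
    from the supplied centers instead of random ones."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centers)), key=lambda k: cosine(v, centers[k]))
            for v in vectors
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                centers[k] = mean(members)
    return labels, centers
```

A caller would vectorize the microblog texts, run `single_pass` once to obtain the centers (so that `len(centers)` plays the role of K), and pass those centers to `kmeans`. The result of the single pass depends on the input order and on the chosen threshold, so both would need tuning on real data.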
