Published in: Journal of Inner Mongolia University of Technology (Natural Science Edition)

Research on an Incremental Clustering Algorithm for Online Public Opinion Text Based on the Single-Pass Algorithm

     

Abstract

With the rapid development of information technology, the Internet has become the main channel of social information dissemination, and the influence of online public opinion keeps growing. Online public opinion is rich in content, large in volume, and spans a wide variety of topics. Although clustering techniques can be used to discover the topics that Internet users care about, traditional clustering algorithms cannot yet be applied directly to the monitoring of massive, dynamic online public opinion. Based on the dynamic evolution of online public opinion, this paper studies efficient incremental text clustering and improves the classical incremental clustering algorithm Single-Pass, addressing its sensitivity to the order of the input data and its computational efficiency. Experimental results show that, when clustering massive volumes of public opinion text, the method greatly improves clustering efficiency while clustering accuracy is not affected.
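For readers unfamiliar with the baseline, the following is a minimal sketch of the classical Single-Pass procedure only, not the improved variant proposed in the paper, whose details the abstract does not give. Everything here is an assumption made for illustration: the whitespace tokenizer, the raw term-count vectors, cosine similarity as the measure, the `threshold` value of 0.5, and the names `cosine` and `single_pass`.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def single_pass(docs, threshold=0.5):
    """Classical Single-Pass clustering (illustrative baseline).

    Each document is read once, in arrival order. It joins the most
    similar existing cluster if the similarity reaches `threshold`;
    otherwise it starts a new cluster. Returns lists of document indices.
    """
    centroids = []  # summed term counts per cluster (same direction as the mean)
    members = []    # document indices per cluster
    for i, text in enumerate(docs):
        vec = Counter(text.lower().split())  # assumed tokenizer: whitespace split
        best, best_sim = -1, 0.0
        for c, centroid in enumerate(centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = c, sim
        if best >= 0 and best_sim >= threshold:
            centroids[best].update(vec)  # fold the document into the cluster
            members[best].append(i)
        else:
            centroids.append(Counter(vec))
            members.append([i])
    return members


# Hypothetical toy input, invented for this sketch.
docs = [
    "flood warning issued city",
    "city flood warning update",
    "football final score tonight",
    "final football score result",
]
clusters = single_pass(docs)
print(clusters)  # [[0, 1], [2, 3]]
```

Because each document is compared against every existing cluster and assigned immediately, the result depends on arrival order and the cost grows with the number of clusters. These are the two weaknesses of Single-Pass that the abstract says the paper addresses.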
