Conference: International Symposium on Computational Intelligence and Design

Micro-blog Short Text Clustering Algorithm Based on Bootstrapping

Abstract

Micro-blog posts are short, so each one carries little textual information, and the text is characterized by timeliness, sparseness, and singularity. Manual selection of attribute words for clustering such text is therefore of limited use. This paper proposes extracting features from micro-blog text with a Bootstrapping algorithm, which selects the words that best reflect the topic of a post. An improved TFIDF algorithm then computes the weights of these feature words, and the K-means algorithm clusters the weighted short texts. The reported experimental results show that the proposed method clusters micro-blog short text better than the other algorithms it was compared against.
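The three stages named in the abstract (Bootstrapping feature selection, TFIDF weighting, K-means clustering) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not specify the bootstrapping rule or the TFIDF improvement, so the sketch uses a simple seed-word co-occurrence heuristic and standard smoothed TF-IDF instead. The function names (`bootstrap_features`, `tfidf_vectors`, `kmeans`), the parameters, and the toy posts are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def bootstrap_features(docs, seeds, rounds=2, min_cooccur=2):
    """Grow a feature-word set from seed words (assumed heuristic, not the paper's rule).

    In each round, a word is added if it appears in at least `min_cooccur`
    documents that already contain a current feature word.
    """
    features = set(seeds)
    for _ in range(rounds):
        counts = Counter()
        for doc in docs:
            words = set(doc)
            if words & features:
                counts.update(words - features)
        new_words = {w for w, c in counts.items() if c >= min_cooccur}
        if not new_words:
            break
        features |= new_words
    return sorted(features)


def tfidf_vectors(docs, vocab):
    """Standard smoothed TF-IDF over `vocab`, L2-normalised (not the improved variant)."""
    n = len(docs)
    vocab_set = set(vocab)
    df = Counter(w for doc in docs for w in set(doc) if w in vocab_set)
    vectors = []
    for doc in docs:
        tf = Counter(w for w in doc if w in vocab_set)
        vec = [
            (tf[w] / len(doc)) * (math.log((1 + n) / (1 + df[w])) + 1) if doc else 0.0
            for w in vocab
        ]
        norm = math.sqrt(sum(x * x for x in vec))
        vectors.append([x / norm for x in vec] if norm > 0 else vec)
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, max_iter=100):
    """Lloyd's K-means with deterministic farthest-point initialisation."""
    centers = [list(vectors[0])]
    while len(centers) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centers))
        centers.append(list(farthest))
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(v, centers[j])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy, made-up tokenised posts: three about sport, three about phones.
posts = [
    ["football", "match", "goal"],
    ["football", "goal", "team"],
    ["match", "team", "goal"],
    ["phone", "camera", "battery"],
    ["phone", "battery", "screen"],
    ["camera", "screen", "phone"],
]
features = bootstrap_features(posts, seeds=["football", "phone"])
labels = kmeans(tfidf_vectors(posts, features), k=2)
print(features)
print(labels)
```

The farthest-point initialisation is used only so that the sketch gives the same result on every run. A real system would also need Chinese word segmentation before any of these steps, which the sketch leaves out.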
