Journal: Computer Engineering and Applications (计算机工程与应用)

A Topic Mining Method for Large-Scale Microblog Data

     

Abstract

With the growing popularity of microblogging, Sina Weibo has become one of the major platforms through which the public obtains and spreads information, and topic mining on microblog data has become an active research area. This paper proposes a topic mining method for large-scale microblog data. The method first analyzes the large-scale microblog data and removes duplicate records using a Bloom Filter algorithm. It then preprocesses the text according to the specific structure of microblog posts. An improved LDA topic model, Social Network LDA (SNLDA), is proposed, and Gibbs sampling is used for model inference to extract the microblog topics. Experimental results show that the method can effectively mine topic information from large-scale microblog data.
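The deduplication step can be illustrated with a minimal Bloom filter sketch. The abstract does not say which field is used as the deduplication key, which hash functions are applied, or how the filter is sized, so everything below is an assumption made for illustration: the key is the whitespace-stripped post text, bit positions come from double hashing over a SHA-256 digest, and the names BloomFilter and deduplicate as well as the defaults num_bits and num_hashes are hypothetical.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; positions come from double hashing a SHA-256 digest."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # packed bit array, all zeros

    def _positions(self, item: str):
        # Derive two 64-bit hashes from one digest, then combine them
        # as h1 + i * h2 to simulate num_hashes independent hash functions.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "probably seen"; False means "definitely not seen".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def deduplicate(posts, num_bits=1 << 20, num_hashes=7):
    """Return posts in input order, skipping any text the filter has probably seen.

    Assumption: two posts are duplicates when their stripped text is identical.
    """
    seen = BloomFilter(num_bits, num_hashes)
    unique = []
    for text in posts:
        key = text.strip()
        if key in seen:
            continue  # probable duplicate; a false positive is possible here
        seen.add(key)
        unique.append(text)
    return unique


if __name__ == "__main__":
    sample = ["hello weibo", "topic mining", "hello weibo", " topic mining "]
    print(deduplicate(sample))
```

A Bloom filter never reports a seen item as unseen, but it can occasionally report an unseen item as seen, so a small fraction of distinct posts may be dropped. For a large corpus that is usually an acceptable trade for constant memory per filter, and the rate can be lowered by increasing num_bits relative to the number of posts.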
