International Conference on Cloud Computing and Big Data Analysis

Hot topic extraction based on Chinese Microblog's Features topic model

Abstract

Microblogging, with its wide participation and convenience, has changed the way people get news about current events. In recent years, much breaking news and many hot topics have been released first on microblog platforms, where they also reach a much wider distribution than on traditional media platforms. Extracting this information in real time helps us grasp the latest and hottest topics currently being discussed by microblog users. However, because microblog content is short and sparse, traditional topic extraction methods cannot be applied directly. In this paper, we propose a new topic model, Microblog Features Latent Dirichlet Allocation (MF-LDA), to extract microblog topics. We incorporate five features unique to microblogs into the LDA model: support, comment, retweet, publish time, and user authority. These features are used to compute each microblog's attention value, authority value, and word frequency. The higher the feature value of a term, the greater the probability that it is a hot topic. Experimental results on real datasets demonstrate that our MF-LDA model is more efficient and accurate than other methods in hot topic extraction.
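The abstract does not give the formulas behind the attention and authority values, so the following is only a rough sketch of the general idea of feature-weighted term counts, not the MF-LDA definition. The `Post` class, the log damping, the 24-hour half-life, and the additive weighting are all illustrative assumptions introduced here.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Post:
    """One microblog post carrying the five features named in the abstract."""
    words: list[str]
    supports: int           # "support" count
    comments: int
    retweets: int
    age_hours: float        # derived from the publish time
    user_authority: float   # ASSUMPTION: already normalised to [0, 1]


def attention_value(post: Post, half_life_hours: float = 24.0) -> float:
    """Hypothetical attention score: log-damped interactions with time decay.

    ASSUMPTION: neither the log damping nor the exponential decay comes
    from the paper; they are placeholders for whatever MF-LDA actually uses.
    """
    interactions = post.supports + post.comments + post.retweets
    decay = 0.5 ** (post.age_hours / half_life_hours)
    return math.log1p(interactions) * decay


def weighted_term_counts(posts: list[Post]) -> Counter:
    """Sum per-term counts, weighting each post by its feature values."""
    counts: Counter = Counter()
    for post in posts:
        # ASSUMPTION: a simple additive combination of the two values.
        weight = 1.0 + attention_value(post) + post.user_authority
        for word in post.words:
            counts[word] += weight
    return counts


posts = [
    Post(["earthquake", "rescue"], supports=120, comments=40, retweets=300,
         age_hours=2.0, user_authority=0.9),
    Post(["lunch", "rescue"], supports=1, comments=0, retweets=0,
         age_hours=30.0, user_authority=0.1),
]
counts = weighted_term_counts(posts)
top_terms = [term for term, _ in counts.most_common(2)]
```

In this toy input, the terms from the recent, heavily retweeted post ("rescue", "earthquake") end up with larger weighted counts than "lunch". Weighted counts of this kind could in principle replace raw word frequencies as input to a topic model, though how MF-LDA integrates the features into LDA is not stated in the abstract.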