Journal: International Journal of Web Based Communities

A novel method for clustering tweets in Twitter

Abstract

Twitter is a popular social networking service for posting short messages that may be useful to people around the world. Researchers have analysed these messages in various ways. This paper proposes a technique for clustering tweets. The aim is to identify groups of similar tweets, which in turn helps identify user communities. These communities can be recommended to advertisers on Twitter by matching each community's topic of interest with the advertiser's field. Suffix Tree Clustering (STC) is a core web document clustering algorithm that groups similar documents into clusters by constructing a suffix tree. We use STC together with semantic similarity among the posted tweets to identify topics of interest. The proposed method is compared with the STC and Lingo algorithms using intra-cluster distance and inter-cluster distance. Results show that the proposed method outperforms the existing methods, with a 10.59% reduction in intra-cluster distance and a 44.99% increase in inter-cluster distance.
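The abstract evaluates clusterings by intra-cluster and inter-cluster distance but does not define either measure. The following is a minimal sketch under one common reading: intra-cluster distance as the mean pairwise distance between tweets in the same cluster (lower is better), and inter-cluster distance as the mean pairwise distance between tweets in different clusters (higher is better). The function names, the sample tweets, and the use of cosine distance over bag-of-words counts are all assumptions made for illustration; the paper itself uses semantic similarity, which this sketch does not attempt to reproduce.

```python
import math
from collections import Counter
from itertools import combinations


def vectorize(text):
    """Bag-of-words term counts for one tweet (lowercased, whitespace split)."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty tweet as maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def intra_cluster_distance(clusters):
    """Mean pairwise distance between tweets that share a cluster (assumed definition)."""
    dists = [
        cosine_distance(vectorize(x), vectorize(y))
        for cluster in clusters
        for x, y in combinations(cluster, 2)
    ]
    return sum(dists) / len(dists) if dists else 0.0


def inter_cluster_distance(clusters):
    """Mean pairwise distance between tweets in different clusters (assumed definition)."""
    dists = [
        cosine_distance(vectorize(x), vectorize(y))
        for c1, c2 in combinations(clusters, 2)
        for x in c1
        for y in c2
    ]
    return sum(dists) / len(dists) if dists else 0.0


# Hypothetical clustering of four made-up tweets into two topics.
clusters = [
    ["new phone camera review", "phone camera test"],
    ["football match tonight", "football match result"],
]
intra = intra_cluster_distance(clusters)
inter = inter_cluster_distance(clusters)
print(f"intra={intra:.3f} inter={inter:.3f}")
```

Under these definitions, a clustering method improves on a baseline when it lowers the first value and raises the second, which is the direction of the 10.59% and 44.99% changes reported above.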
